Implementing I2C Slave on an FPGA/CPLD

After pounding my brain with books and training, it is time to start a real FPGA/CPLD project.

The project will involve a Teensy 3.1 sending commands to an Altera Max II CPLD to control connected devices (a lot of LEDs). I selected the Teensy partly because it is a 3.3V board, which is pretty much a necessity since the Max II is not 5V tolerant. Also, the Teensy has a Cortex-M4 processor running at 72MHz, way faster than an AVR-based Arduino.

The first step in my project is to implement I2C so I can communicate with the FPGA as a slave device from an Arduino. After a little research I found this VHDL-based I2C Slave component.

During the ‘research’ phase of learning design languages, I read two college textbooks each for Verilog and VHDL. After wading through all of that material, it was a toss-up which language I should go with. Neither was particularly easy for a rank amateur.

I went looking for comparisons between the two. The most pertinent difference I could find is that it is harder to get a clean compile in VHDL because of its strict type checking, but once the code compiles, the likelihood of it working is higher. Given my preference for Pascal and its strict type checking (which I like because it keeps me out of trouble in ways C doesn’t), I decided VHDL will be the HDL for me.

The I2C Slave design is the most complex design I have studied to date, so I spent a weekend reading through the code repeatedly until I understood almost everything occurring. Reminded me of my days as a jr. programmer, carefully studying production Fortran and COBOL programs trying to understand the nuances of those languages (though some might say there is no nuance in either).

With a reasonable understanding of the code, I was ready to try to implement the design and its test bench. It nearly compiled perfectly under Altera’s Quartus II. The only change necessary was to assign a default value to the slave address generic:

    SLAVE_ADDR          : std_logic_vector(6 downto 0)   := "0000000");


With a good compile, I sat down and started learning how to carefully monitor the design with the Test Bench.

I am still a little unclear on when signals update inside a process. I think I know, yet sometimes signals don’t update when I think they should. I finally found that the best way for me to figure out what was going on was to execute complete statements in the test bench and look at the signals there. Trying to step through code inside the process was confusing given my current knowledge.
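A rough way to picture the rule involved: a signal assignment inside a process only schedules a new value, and the signal itself does not change until the process suspends. The following is a toy C++ model of that behaviour, not VHDL and not anything taken from the I2C slave core. The `Signal` type and `run_process_body` function are made-up names for illustration only.

```cpp
#include <cstdint>

// Toy stand-in for a VHDL signal. Hypothetical type, for illustration only.
struct Signal {
    uint8_t current;  // value that every read sees right now
    uint8_t pending;  // value scheduled by the most recent assignment

    explicit Signal(uint8_t initial) : current(initial), pending(initial) {}

    void assign(uint8_t v) { pending = v; }  // models:  sig <= v;
    void commit() { current = pending; }     // models the update once the process suspends
};

// Models one pass through a clocked process body containing
//     a <= a + 1;
//     b <= a;
// Both statements read a.current, so b is scheduled with the OLD value of a.
inline void run_process_body(Signal& a, Signal& b) {
    a.assign(static_cast<uint8_t>(a.current + 1));
    b.assign(a.current);
}
```

Starting from a = 5 and b = 0, neither value changes while the body runs. Only after both signals are committed does a become 6, and b becomes 5 rather than 6. That is the same effect seen when stepping through a process and finding that a signal assigned a few lines earlier still shows its old value.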

Once I realized this, I found the test bench worked perfectly. The component works great. It took pretty much 8 hours to prove this to myself.

Since this is my first real design, not largely copied out of a book, I wanted to keep it as simple as possible. I need to know I can produce a design, load it into the CPLD, access it properly from the Teensy and see valid signals on a logic analyzer.

So the first test is to simply read a byte from the CPLD, which is about as simple as it gets.

The code was pretty simple:


Ha! I spent nearly an entire day trying to understand why, even though the report statement was executed, data_to_master was never set until after the master read the data.

The answer was in the readme file the whole time (which I read early on, but didn’t fully appreciate):

“Clock stretching is not implemented, so make sure that data to master is ready upon request.”

There is no time between read_req going high and the register being transferred to the master. Instead I need to make sure data_to_master already has the appropriate value when the read request occurs:


With that change, the test bench started working. Within an hour the design was on the Max II CPLD, and an I2C program to request the byte was running. After 3 long days, the design was running!
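The timing problem described above can be modeled on a host machine. This is a minimal C++ sketch, assuming (as the readme quote suggests) that the core captures data_to_master on the same clock edge on which it raises read_req. `ReadPathModel` and `capture_on_read_req` are hypothetical names, and the code is not a translation of the I2C_slave source.

```cpp
#include <cstdint>

// Hypothetical single-cycle model of the read path. Field and function names
// are illustrative and are not taken from the I2C_slave source.
struct ReadPathModel {
    uint8_t data_to_master;  // value the user logic is driving before the edge

    explicit ReadPathModel(uint8_t initial) : data_to_master(initial) {}

    // Models the clock edge on which read_req is raised. The core is assumed
    // to capture data_to_master on that same edge, so an assignment made in
    // reaction to read_req (reactive_value) only becomes visible afterwards.
    uint8_t capture_on_read_req(uint8_t reactive_value) {
        const uint8_t sent_to_master = data_to_master;  // old value goes out
        data_to_master = reactive_value;                // update lands too late
        return sent_to_master;
    }
};
```

If the model starts with 0x00 and the user logic only supplies 0x55 in reaction to the request, the master receives 0x00. If 0x55 is already in place before the request, the master receives 0x55. That is the difference between the failing and the working version of the design.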


With the first design running, I was ready to increase the complexity a bit. So today’s test was to write a byte to the Max II, have it increment the byte by 1, and then read it back so I could compare it.
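For the comparison step, the expected reply is the written byte plus one, truncated to 8 bits. This assumes the CPLD does the addition in 8-bit arithmetic, so 0xFF wraps around to 0x00. The following is a minimal host-side C++ sketch of that check. `expected_echo` and `echo_matches` are hypothetical helper names and are not part of the Teensy program in this post.

```cpp
#include <cstdint>

// Value the master should read back after writing `written`, assuming the
// CPLD adds 1 in 8-bit arithmetic (so 0xFF wraps around to 0x00).
constexpr uint8_t expected_echo(uint8_t written) {
    return static_cast<uint8_t>(written + 1u);
}

// True when the byte read back matches what the echo design should return.
constexpr bool echo_matches(uint8_t written, uint8_t read_back) {
    return read_back == expected_echo(written);
}
```

For example, writing "11001100" (0xCC) should read back as 0xCD, and writing 0xFF should read back as 0x00.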

Now that I have a clue, the second design came MUCH faster. It was up and running in < 2 hours. Here is the full VHDL listing:

library ieee;
   use ieee.std_logic_1164.all;
   use ieee.numeric_std.all;
   use std.textio.all;
   use work.txt_util.all;

entity myI2cEchoTest is
   port (
      scl               : inout std_logic;
      sda               : inout std_logic;
      clk               : in    std_logic;
      rst               : in    std_logic
   );
end myI2cEchoTest;

architecture RTL of myI2cEchoTest is

signal read_req         : std_logic                      := '0';
signal data_to_master   : std_logic_vector (7 downto 0)  := "01010101";
signal data_valid       : std_logic                      := '0';
signal data_from_master : std_logic_vector (7 downto 0)  := (others => '0');
signal data_reg: std_logic_vector (7 downto 0);

component I2C_slave is
  generic (
    SLAVE_ADDR          : std_logic_vector(6 downto 0)   := "0000000"); -- I added := "0000000" to get it to compile
  port (
    scl                 : inout std_logic;
    sda                 : inout std_logic;
    clk                 : in    std_logic;
    rst                 : in    std_logic;
    -- User interface
    read_req            : out   std_logic;
    data_to_master      : in    std_logic_vector(7 downto 0);
    data_valid          : out   std_logic;
    data_from_master    : out   std_logic_vector(7 downto 0));
end component I2C_slave;


begin

i2cSlave: I2C_slave 
   generic map (
      SLAVE_ADDR => "0000011"
   )
   port map (
      scl               => scl,
      sda               => sda,
      clk               => clk,
      rst               => rst,
      -- User interface
      read_req          => read_req,
      data_to_master    => data_to_master,
      data_valid        => data_valid,
      data_from_master  => data_from_master
   );

-- When the master has written a byte, store that byte plus one so it is
-- already sitting in data_to_master before the next read request arrives.
process (clk) 
begin
   if rising_edge(clk) then
      if data_valid = '1' then
         data_to_master <= std_logic_vector(unsigned(data_from_master) + 1);
      end if;
   end if;
end process;

end architecture RTL;

The test bench contains everything in the I2C slave test bench (all of the declarations and procedures) except for the main code, which is very simple:

    scl_test <= '1';
    sda_test <= '1';
    i2c_write("0000011", "11001100");
    wait for 100 ns;

    print("i2c_read test");
    i2c_read("0000011", received_data);
    report "Received data: " & hstr(received_data);
    state_dbg <= 6;

    wait until rising_edge(clk_test);
    assert false report "simulation completed successfully" severity failure;
  end process;
end Testbench;
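The test bench and the Teensy program both address the CPLD with the 7-bit address "0000011" (decimal 3). On the wire, and on some logic analyzer displays, the first byte of a transfer looks different, because the 7-bit address is shifted left by one and the read/write flag occupies bit 0. The following small C++ sketch shows that arithmetic. `i2c_address_byte` is a hypothetical helper name, not something from the Wire library or the test bench.

```cpp
#include <cstdint>

// First byte of an I2C transfer: 7-bit address in bits 7..1, R/W flag in
// bit 0 (0 = master writes, 1 = master reads).
constexpr uint8_t i2c_address_byte(uint8_t addr7, bool is_read) {
    return static_cast<uint8_t>(((addr7 & 0x7Fu) << 1) | (is_read ? 1u : 0u));
}
```

For address 3 this gives 0x06 for the write transfer and 0x07 for the read transfer.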

The Teensy code is straightforward as well:

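#include <Wire.h>

void setup()
{
  Wire.begin();        // join i2c bus (address optional for master)
  Serial.begin(9600);  // start serial for output
  while (!Serial.dtr()) {}   // wait until the serial monitor is connected
  Serial.println("program begins");
}

void loop()
{
  byte c;
  byte i;
  i = 0;
  while (true) {
    // Write the test byte to slave device #3 (the step described above)
    Wire.beginTransmission(3);
    Wire.write(i);
    Wire.endTransmission();

    Wire.requestFrom(3, 1);    // request 1 byte from slave device #3
    while (!Wire.available()) {}
    c = Wire.read();           // receive the byte (expected to be i + 1)
    Serial.print("Write: ");
    Serial.print(i);           // print the byte written, as a number
    Serial.print("; Read: ");
    Serial.println(c);         // print the byte read back, as a number
    i = i + 1;
    if (i == 255) {
      Serial.println("Press any key to repeat");
      while (!Serial.available()) {}                  // wait for a key press
      while (Serial.available()) { Serial.read(); }   // discard the typed input
    }
  }
}
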

At this point I’m confident I can communicate with the Max II CPLD and I can focus on the next phase of the project – sending commands and byte strings to the CPLD.


2 Responses to Implementing I2C Slave on an FPGA/CPLD

  1. Jogi says:

    Hey Dan, first of all – thank you so much for sharing this nice blog post with us. I’m currently doing my first steps in the FPGA world and this is the only working i2c slave core implementation I was able to find (which I almost understand😉 )

    Finally I have one lasting question: your example is working like charm as an 8 Bit IO Expander – Right now I’m struggling to extend this sample code to work as an 16 Bit IO Expander. (the second byte will be transferred by continuing clocking… so 24 clocks will be used to read out the 16 Bit expander.)

    Currently I’m thinking of kind of notification system, which raises a data_byte_0_transfer_finished signal, which then initiates the transfer of the next 8 Bits to the output array…

    Do you have any hint for me how I could implement it the most simple way?

    Thanks and regards,
