Creating a Clap-Controlled Arduino Remote

I’ve recently been playing around with an Arduino Uno and I thought I’d share some of what I’ve learned.

The Arduino Uno board. Great for first-time makers.

Introduction

I’ve been interested for a while in getting into hardware but I never really got around to it. A few weeks ago, I just said screw it and bought a small Arduino starter kit with the Uno, a small breadboard, some jumper wires, LEDs, resistors, and various sensors: https://www.amazon.com/gp/product/B01DGD2GAO/ref=oh_aui_detailpage_o01_s00?ie=UTF8&psc=1

For a while, my dad has been asking me (somewhat jokingly) to make him a voice-controlled remote for the TV. It sounded interesting but I think that automatic speech recognition (ASR) is a little above my paygrade. Maybe I’ll try it sometime in the future.

We settled on making a clap-controlled remote, something that sounded more doable than recognizing speech commands. In this post, I’ll describe the process of both writing the software and setting up the circuit for the remote. I hope you’ll learn something as I did making it.

Reading the Remote Commands

Most readers probably know that TV remotes send commands to the TV via infrared. The TV manufacturer has to choose how to encode each digital command (e.g. power, channel up/down) into the analog signal output by the infrared diode.

Sony SIRC Protocol

I have a Sony TV which uses the SIRC protocol for transmitting remote commands. Which TV you have greatly affects how you decode and read the signals from the remote.

The SIRC protocol uses pulse-width modulation (PWM) to encode 0’s and 1’s. A 0 is represented by a 600 microsecond pulse followed by a 600 microsecond space. A 1 is a 1.2 millisecond pulse (twice that of a 0) followed by the same 600 microsecond space.

Each command has an address appended to it which signifies what type of device the command is going to (TV, VCR, CD player, etc.).

There are three versions of this protocol: the 12-bit, 15-bit, and 20-bit versions. In each version the actual command is 7 bits long; what changes between versions is the length of the address (the 20-bit version also adds 8 extended bits).

My TV uses the 12-bit version so that’s the one I will talk about. In the 12-bit version, the address is 5 bits long and for a TV, it is 00001. Each command takes the following form:

Sony SIRC 12-bit protocol

As the picture describes, you start with a 2.4 millisecond pulse (followed by the standard 600 microsecond wait time). Then you transmit the command followed by the address. Both the command and address are little-endian, meaning the least-significant bit is sent first.
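To make the frame layout concrete, here is a small host-side C++ sketch (plain C++, not Arduino code) that lists the pulse and space lengths for one 12-bit frame. The function name sircFrame and the constant names are my own inventions for illustration; the timings are the ones described above.

```cpp
#include <cstdint>
#include <vector>

// Durations in microseconds, taken from the protocol description above.
const int kStartMark = 2400;
const int kOneMark = 1200;
const int kZeroMark = 600;
const int kSpace = 600;

// Hypothetical helper: returns alternating pulse/space durations for one
// 12-bit SIRC frame (start burst, 7 command bits, 5 address bits).
std::vector<int> sircFrame(uint8_t command, uint8_t address) {
    std::vector<int> timings;
    timings.push_back(kStartMark);
    timings.push_back(kSpace);
    // Command occupies bits 0-6, address occupies bits 7-11.
    const unsigned bits = (command & 0x7Fu) | ((address & 0x1Fu) << 7);
    for (int i = 0; i < 12; ++i) {  // least-significant bit first
        timings.push_back(((bits >> i) & 1u) ? kOneMark : kZeroMark);
        timings.push_back(kSpace);
    }
    return timings;
}
```

Calling sircFrame(21, 1) gives 26 entries: the start burst and its space, followed by twelve pulse/space pairs.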

For a more in-depth explanation of the protocol, here’s a great article about it: https://www.sbprojects.net/knowledge/ir/sirc.php

Using a TSOP

To read the infrared signal coming from the remote, we need an infrared receiver. You could probably use just a regular IR diode, but it’s much easier to buy a TSOP, an IR receiver that has some extra circuitry to deal with gain and noise reduction. For this project I bought a TSOP4838.

Diagram of the TSOP

As you can see above, the TSOP has three pins: one for power, one to ground, and one to read the output on. Just what we need.

Let’s hook up the TSOP to our Arduino. Connect the Arduino’s +5V pin to Vs and its GND pin to the TSOP’s ground. Then connect OUT to a pin for reading. I initially thought this would have to be one of the analog pins, but the TSOP’s output is digital. In my setup I could only read from the module when it was connected to one of the PWM-enabled digital pins (as far as I know, pulseIn should work on any digital pin, so your mileage may vary). Therefore, we’ll use pin 11.

Schematic for reading from a TSOP

Decoding the Signal

Now that we can read from the IR receiver, we need to decode the signal to see what the commands are. We could use Ken Shirriff’s IRremote library, which is really handy, but it’s more fun to write it ourselves.

Setting up the pins:

int irPin = 11;

void setup() {
  Serial.begin(9600);
  pinMode(irPin, INPUT);
}

We will need access to the serial monitor later to read the commands, so we start transmission at 9600 bps.

Now, let’s write the code to decode the SIRC commands:

const int commandMask = 127; // lower 7 bits of the frame hold the command

void loop() {
  // the TSOP output idles HIGH, so each IR burst shows up as a LOW pulse
  long duration = pulseIn(irPin, LOW);
  if (duration >= 2400) { // start burst
    int bitnum = 0;
    int bitstring = 0;
    duration = pulseIn(irPin, LOW);

    while (duration >= 600 && duration < 2000) {
      int currbit = 0;
      if (abs(duration-1200) < abs(duration-600))
        currbit = 1;

      bitstring = (currbit << bitnum) | bitstring;
      ++bitnum;
      duration = pulseIn(irPin, LOW);
    }
    int address = bitstring >> 7;
    int command = bitstring & commandMask;
    Serial.print("Address: ");
    Serial.print(address);
    Serial.print(", Command: ");
    Serial.println(command);
  }
}

The pulseIn function waits for the pin in the first argument to reach the level in the second argument, then times how long it stays there. Therefore, pulseIn(irPin, LOW) returns the time (in microseconds) that pin 11 spends LOW as it goes from HIGH to LOW and back to HIGH.

If we detect a pulse at least 2400 microseconds, we know we have received the start burst of the SIRC protocol.

We next initialize a few integers, bitnum and bitstring. bitnum is the number of bits currently read and bitstring is the command currently read. Then we keep reading pulses until we get a duration outside of the allowed range of values (I use 2000 instead of 2400 for the upper bound because I must’ve had some interference).

For each pulse, if the length is closer to 1200 than 600, we decide that we received a logical 1. Otherwise it is a 0. We then set the bitnum-th bit of the bitstring to the logical bit read. Then increment the number of bits read and re-read a pulse.

bitstring should now be 12 bits with the 5 most significant bits being the address and 7 lower bits being the command.
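If you want to check the decoding logic without any hardware, the same steps can be sketched in plain C++ on a computer. This is only a host-side model of the loop above: decodeMarks and SircCommand are hypothetical names, and the input is assumed to be the list of pulse lengths (in microseconds) that followed the start burst.

```cpp
#include <cstdlib>
#include <vector>

struct SircCommand {
    int address;
    int command;
    int bitCount;
};

// Hypothetical host-side version of the decoding loop: classifies each pulse
// as 0 or 1 and packs the bits least-significant bit first.
SircCommand decodeMarks(const std::vector<long>& marks) {
    int bitstring = 0;
    int bitnum = 0;
    for (long duration : marks) {
        if (duration < 600 || duration >= 2000) break;  // same window as the sketch
        const int currbit =
            std::labs(duration - 1200) < std::labs(duration - 600) ? 1 : 0;
        bitstring |= currbit << bitnum;
        ++bitnum;
    }
    return SircCommand{bitstring >> 7, bitstring & 127, bitnum};
}
```

Feeding it twelve slightly jittery pulse lengths for the power command should give back address 1 and command 21.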

NOTE: Apparently, each command is transmitted three times, approximately 45 milliseconds apart. Therefore, we should see each command printed out three times.

Uploading the code to the Arduino, pointing the remote at the TSOP, and pressing the corresponding buttons, we can record the commands we need. For my purposes, all I need are power (21), channel up/down (16/17), and volume up/down (18/19).

Transmitting the Remote Commands

Now, again we could use the IRremote library, but it’s more fun to implement this ourselves. The transmitting is surprisingly much more involved than the receiving.

Transmission is complicated since we need to operate at a more refined frequency (40kHz) and duty-cycle (1/4) than can be achieved using the built-in Arduino delayMicroseconds and digitalWrite functions.

Duty Cycle

The discussion before about the pulse for a logical 0 being 600 microseconds long and 1200 for a 1 was incomplete. These pulses are not HIGH for 100% of the burst time. They are parameterized by their duty cycle:

Diagram representing different duty cycles

The duty cycle of a PWM signal is the ratio of the amount of time in HIGH to the time that one cycle takes. The SIRC protocol uses a duty cycle of 25% at 40kHz, so each 25 microsecond cycle is HIGH for only 6.25 microseconds. If we were using delayMicroseconds, we would need to toggle the value on the output pin at 6.25 microsecond granularity, and delayMicroseconds doesn’t offer sub-microsecond resolution.
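The arithmetic behind those numbers is short enough to write down. This is a plain C++ sketch for a computer, not Arduino code, and carrierTiming is a name I made up for illustration.

```cpp
struct CarrierTiming {
    double periodUs;  // length of one full cycle
    double highUs;    // time spent HIGH per cycle
    double lowUs;     // time spent LOW per cycle
};

// Hypothetical helper: splits one carrier cycle into HIGH and LOW time.
CarrierTiming carrierTiming(double frequencyHz, double dutyCycle) {
    const double periodUs = 1e6 / frequencyHz;
    const double highUs = periodUs * dutyCycle;
    return CarrierTiming{periodUs, highUs, periodUs - highUs};
}
```

For 40kHz at a 25% duty cycle this works out to a 25 microsecond period with 6.25 microseconds HIGH and 18.75 microseconds LOW.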

Arduino Timers

Thankfully, there is another way.

The ATmega328p comes with three timer/counters that can be used for PWM. The basic mechanism behind these timers is that they increment a counter at a regular interval until a maximum value is hit. Once this happens, the timer restarts. This maximum value determines the frequency of the PWM wave.

There is also a register to modify the duty cycle of the PWM wave. When this value is hit by the counter, a value of HIGH is put on an output register and returned to LOW at a time thereafter.

There are a number of different operating modes for the timers. These modes modify attributes such as how the timer resets after the limit is hit or how the top limit is specified.

For a detailed description of all operating modes, take a look at: http://www.righto.com/2009/07/secrets-of-arduino-pwm.html. The operating mode, or waveform generation mode (WGM), that we will use is called Phase-Correct PWM.

Phase-Correct PWM

Timing diagram for Phase-Correct PWM

In Phase-Correct PWM, the timer continually counts up until the max value is hit and then decrements until the counter is at zero. Normally, the max value is the width of the counter register (255 for timers 0 and 2, 65535 for timer 1). However, we can put our own max value into register OCRnA. With the timer registers, the lowercase n is a number 0, 1, or 2 corresponding to one of the timers.

By setting OCRnB, we modify the duty cycle. In Phase-Correct PWM, when the counter goes below the value in OCRnB, a HIGH voltage is put onto the output pin OCnB, and it returns to low when the counter goes above OCRnB. For Timer 2 (the timer we will be using), OC2B is pin 3. From now on, the n in all register names will be replaced with 2.
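To convince myself that this really produces the duty cycle we want, here is a simplified tick-by-tick model of the counter in plain C++. It is a sketch under assumptions, not a cycle-accurate emulation: simulatedDuty is a hypothetical name, it assumes 0 < compare < top, and it treats the output as switching on the tick where the counter matches the compare value.

```cpp
// Simplified model of Phase-Correct PWM: the counter runs 0 -> top -> 0, the
// output goes LOW on a compare match while counting up and HIGH on a compare
// match while counting down. Returns the fraction of one period spent HIGH.
double simulatedDuty(int top, int compare) {
    int counter = 0;
    bool countingUp = true;
    bool outputHigh = true;  // counter starts below the compare value
    long highTicks = 0;
    const long period = 2L * top;
    // Run two periods and measure only the second, so start-up state is ignored.
    for (long tick = 0; tick < 2 * period; ++tick) {
        if (counter == compare) outputHigh = !countingUp;
        if (tick >= period && outputHigh) ++highTicks;
        if (countingUp) {
            if (counter == top) { countingUp = false; --counter; }
            else ++counter;
        } else {
            if (counter == 0) { countingUp = true; ++counter; }
            else --counter;
        }
    }
    return static_cast<double>(highTicks) / period;
}
```

With a top of 200 and a compare value of 50, the model spends 100 of every 400 ticks HIGH, which is the 25% duty cycle SIRC calls for.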

I initially tried to use Timer 0, but nothing was working. After much digging, I found that Timer 0 is already used by the Arduino core for timekeeping functions such as delay and millis! Using this timer for our custom PWM breaks those functions.

While the frequency of our PWM is inversely proportional to the value in OCR2A, that value does not completely specify the frequency. The actual formula for the frequency is 16MHz/(2 * prescaler * OCR2A). 16MHz is the frequency of the CPU for the ATmega328p while the prescaler is either 1, 8, 32, 64, 128, 256, or 1024 and is specified by the three least significant bits of register TCCR2B.

For our purposes, we will use a prescaler of 1, so to achieve a frequency of 40kHz, OCR2A must be set to 200; 16MHz/(2*200) = 40kHz. Then to get our desired duty cycle of 25%, we must set OCR2B to 50, or 200/4.
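If you want to try a different carrier frequency or prescaler, the formula is easy to turn around. Below is a small plain C++ sketch of that calculation; the three function names are hypothetical helpers of mine, and the rounding to the nearest integer is my own choice.

```cpp
// Phase-Correct PWM frequency, following the formula in the text.
double pwmFrequencyHz(double clockHz, unsigned prescaler, unsigned top) {
    return clockHz / (2.0 * prescaler * top);
}

// Hypothetical helper: the OCR2A value for a target frequency (rounded).
unsigned topForFrequency(double clockHz, unsigned prescaler, double targetHz) {
    return static_cast<unsigned>(clockHz / (2.0 * prescaler * targetHz) + 0.5);
}

// Hypothetical helper: the OCR2B value for a duty cycle given as a fraction.
unsigned compareForDuty(unsigned top, double dutyCycle) {
    return static_cast<unsigned>(top * dutyCycle + 0.5);
}
```

With a 16MHz clock, a prescaler of 1, and a 40kHz target, topForFrequency returns 200 and compareForDuty(200, 0.25) returns 50, matching the values above.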

Transmitting the Signal

Now it’s time to code. Let’s first define some constants:

const int startBurstLength = 2400;
const int oneBurstLength = 1200;
const int zeroBurstLength = 600;
const int waitLength = 600;
const byte topVal = 200;
const byte dutyCycle = topVal/4;

The burst lengths are what we will use to transmit 0’s or 1’s according to SIRC and topVal and dutyCycle are the values discussed earlier for generating the correct PWM pulse train.

The setup is similar to that of receiving but now pin 3 is used for output:

void setup() {
Serial.begin(9600);
pinMode(3, OUTPUT);
}

We will now write our loop function to transmit the codes we send to the Arduino via the serial monitor:

void loop() {
  if (Serial.available()) {
    int command = Serial.readString().toInt();
    if (command == 0) return; // toInt() returns 0 for non-numeric input

    for (int i=0; i<3; i++) {
      transmit(command);
      delay(40);
    }
  }
}

We transmit each command three times with a 40ms delay as the SIRC protocol specifies.

Now, we have to write the function to transmit the command:

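// Send the 40kHz carrier for delayUs, then the standard 600us gap
void mark(int delayUs) {
  TCCR2A |= (1<<COM2B1); // enable output on b
  delayMicroseconds(delayUs);
  space(waitLength);
}

// Keep the output silent for delayUs
void space(int delayUs) {
  TCCR2A &= ~(1<<COM2B1); // disable output on b
  delayMicroseconds(delayUs);
}

// The TV address is 1, sent least-significant bit first: 1, 0, 0, 0, 0
void sendTVAddress() {
  mark(oneBurstLength);
  for (char i=0; i<4; i++) {
    mark(zeroBurstLength);
  }
}

// Configure Timer 2, then send the start burst and the 7 command bits
void sendCommand(byte command) {
  TIMSK2 &= ~(1<<TOIE2); // disable the Timer 2 overflow interrupt

  digitalWrite(3, LOW);

  OCR2A = topVal;    // counter max: sets the 40kHz frequency
  OCR2B = dutyCycle; // compare value: sets the 25% duty cycle
  TCCR2A = (1<<WGM20);
  TCCR2B = (1<<WGM22) | (1<<CS20); // Phase-Correct PWM, prescaler of 1
  mark(startBurstLength);

  for (char i=0; i<7; i++) { // least-significant bit first
    if (command % 2 == 0) mark(zeroBurstLength);
    else mark(oneBurstLength);
    command >>= 1;
  }
}

void transmit(byte command) {
  sendCommand(command);
  sendTVAddress();
}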

We use two procedures to transmit the command, sendCommand followed by sendTVAddress. sendCommand first clears the TOIE2 bit of TIMSK2, which disables the overflow interrupt on Timer 2. If we did not do this, our PWM could get interrupted by the system, invalidating the command. As an aside, we use the 1<<(OFFSET) syntax to set the OFFSET-th bit of a register.

We then put a LOW value on pin 3 just to be safe.

As discussed earlier, topVal (200) goes into register OCR2A and dutyCycle (50) into OCR2B.

We set WGM20 in TCCR2A and WGM22 in TCCR2B to signify that we want to use Phase-Correct PWM. Below is the chart for which WGM bits to set for a certain WGM:

WGM Bit Description

We have to split the WGM bits between the two registers because this is how TCCR2A/B are structured:

Structure of the TCCR2A register
Structure of the TCCR2B register

These registers are packed with flags and parameters, so Atmel had to split up the three WGM bits.

The last thing to mention about the register initialization in sendCommand is the setting of CS20 in TCCR2B. According to the datasheet, setting CS20 corresponds to a prescaler of 1:

How to set the prescaler for Timer 2

Now let’s look at the mark function:

void mark(int delayUs) {
  TCCR2A |= (1<<COM2B1);
  delayMicroseconds(delayUs);
  space(waitLength);
}

If you look in the datasheet, you’ll see that the COM2B1 bit enables the B output for Timer 2 (the B output is pin 3). We OR TCCR2A with this bit to set it equal to one, then let it run for either 600 or 1200 microseconds depending on whether we want to transmit a 0 or a 1. Finally, we wait 600 microseconds after every pulse:

void space(int delayUs) {
  TCCR2A &= ~(1<<COM2B1); // disable output on b
  delayMicroseconds(delayUs);
}

This code is very similar to mark but we AND with the negation of COM2B1 to set that bit to 0.
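Since one wrong bit can change the whole timer mode, it helped me to see the register values written out. The sketch below models TCCR2A and TCCR2B as plain variables in ordinary C++, so it runs on a computer and touches no hardware. The struct and function names are hypothetical, and the bit positions are the ones I read from the ATmega328P datasheet (I added a _BIT suffix so they don’t clash with the real macros).

```cpp
#include <cstdint>

// Bit positions for Timer 2, per the ATmega328P datasheet.
const uint8_t WGM20_BIT = 0;   // in TCCR2A
const uint8_t COM2B1_BIT = 5;  // in TCCR2A
const uint8_t CS20_BIT = 0;    // in TCCR2B
const uint8_t WGM22_BIT = 3;   // in TCCR2B

// Plain variables standing in for the hardware registers.
struct Timer2Model {
    uint8_t tccr2a = 0;
    uint8_t tccr2b = 0;
};

// Same assignments as in sendCommand: Phase-Correct PWM, prescaler of 1.
void configurePhaseCorrect(Timer2Model& t) {
    t.tccr2a = static_cast<uint8_t>(1 << WGM20_BIT);
    t.tccr2b = static_cast<uint8_t>((1 << WGM22_BIT) | (1 << CS20_BIT));
}

// OR with the bit to set it (what mark does).
void enableOutputB(Timer2Model& t) {
    t.tccr2a = static_cast<uint8_t>(t.tccr2a | (1 << COM2B1_BIT));
}

// AND with the negated bit to clear it (what space does).
void disableOutputB(Timer2Model& t) {
    t.tccr2a = static_cast<uint8_t>(t.tccr2a & ~(1 << COM2B1_BIT));
}
```

After configurePhaseCorrect, the model holds 0x01 in TCCR2A and 0x09 in TCCR2B. Enabling the B output changes TCCR2A to 0x21, and disabling it brings it back to 0x01 without disturbing the WGM bit.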

I highly suggest looking at the ATmega328p datasheet if you want to gain a deeper understanding of the different WGM modes as well as how all of the different registers affect timing.

Putting it All Together

Now that we have written the code, it’s time to wire up the IR diode. Once completed, the setup should look like this:

Fritzing sketch for transmitting IR commands

I couldn’t find an IR component in the Fritzing application so make sure you’re not using a red LED for this circuit. I also suggest putting the diode towards the back of the breadboard so you can tilt it up since it is crucial the diode points at the TV.

It’s even better if you have some DuPont male-female wires so you can point the diode without having to tilt the whole breadboard.

Now that everything’s set up, you can upload the code onto the Arduino. Then open up the serial monitor and enter some commands such as 21 for power or 16 for channel up.

If the TV is not responding, it could be due to any number of things: 1) the diode is connected backwards, 2) it is not pointed at the TV’s receiver, or 3) the transmit code is not exact. The timer code is very sensitive to small changes. If, for example, just one bit is changed in TCCR2A/B, that could do something as drastic as changing the WGM.

If all else fails, you can just use sendSony from Ken Shirriff’s IRremote library. I’ll cover how to use this library a little later.

Counting Claps

Now that we have the machinery in place to transmit the infrared commands, we need a way of listening to the environment and detecting/counting how many claps occur. Before we even think about doing that, we’ll need to hook up a microphone to the Arduino.

Using a Microphone

For a microphone, I am using a two-pin electret that I ripped out of an old landline. (I have a few MAX4466 three-pin electrets but have yet to solder the pins on).

Now, I’m not an electrical engineer (I am a computer science student) so I had some difficulty figuring out how to hook up the microphone properly. I scoured the web but most circuits included some sort of op-amp or transistor. I don’t have anything like that at my disposal, so I figured I’d just use the tried and true method of trial and error.

After a lot of trial and error, I settled on the following setup:

Circuit for reading from the microphone

I was kind of trying to emulate an RC high-pass filter, which I read the Clapper uses, but I’m pretty sure the above is not exactly this (again, no electrical engineering education).

The resistor is critical in the circuit above while the capacitor seemed to just provide minor improvements. So if you don’t have one at your disposal, don’t fret.

If anyone with electrical engineering experience wants to school me on how the circuit should really look, that would be quite welcome.

Here is the breadboard sketch for the above circuit:

Breadboard diagram for reading from the microphone

Reading from the Microphone

Reading from the mic is as simple as performing an analogRead on pin A0 (or the analog pin you choose). If you have followed along throughout the previous sections of this tutorial, the following code should be pretty self-explanatory:

int micPin = A0;
int micVal;

void setup() {
  Serial.begin(9600);

  pinMode(micPin, INPUT);
}

void loop() {
  micVal = analogRead(micPin);
  Serial.println(micVal);
}

Detecting Claps

Now that we can detect sound pressure fairly reliably, we need a method of detecting claps. A clap, one would expect, has a fairly small extent in the time domain. Therefore, it would contain many frequencies, especially those higher than naturally found in the environment. This is just the nature of the Fourier transform.

If you don’t understand what I’m talking about, don’t worry because this isn’t the final method used.

I tried to buffer the incoming audio and perform the FFT on it for frequency analysis but the results were completely unreliable. This could be a result of either the library I was using, my microphone setup being incorrect, or both.

Therefore, we will need to come up with a method in the time domain of detecting claps.

Attack-Based Method

Let’s look at the waveforms of a clap versus a quick shout:

(Top) A recording of a clap. (Bottom) A recording of a shout

As you can see, both recordings reach about the same max volume. The major difference is that the clap reaches that max far more quickly than the shout. The time that the recording takes to reach its max is known as the attack.

Let’s zoom in a little more just to emphasize how much quicker the attack is on the clap:

A clap’s attack is much shorter than anything the human voice can produce
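To pin down what I mean by attack, here is a small plain C++ sketch that measures it from a buffer of readings. The names Attack and measureAttack are hypothetical, and it assumes the samples are evenly spaced and start at the moment the sound first got loud.

```cpp
#include <cstddef>
#include <vector>

struct Attack {
    int peak;             // largest reading in the buffer
    double timeToPeakMs;  // time from the first sample to that reading
};

// Hypothetical helper: finds the peak and how long the signal took to reach it.
Attack measureAttack(const std::vector<int>& samples, double sampleIntervalMs) {
    Attack result{0, 0.0};
    for (std::size_t i = 0; i < samples.size(); ++i) {
        if (samples[i] > result.peak) {  // keep the first occurrence of the max
            result.peak = samples[i];
            result.timeToPeakMs = static_cast<double>(i) * sampleIntervalMs;
        }
    }
    return result;
}
```

With made-up readings, a clap-like buffer such as {625, 700, 660, 640, 620} sampled every millisecond peaks after 1ms, while a shout-like buffer that climbs steadily to the same 700 over seven samples taken 10ms apart peaks after 60ms.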

Attack-Based Detection on the Arduino

Now that we have a pretty solid method of identifying a clap versus other loud noises based on its attack, let’s implement it in code:

int micPin = A0;
int micVal;
int threshold = 620;
unsigned int nClaps = 0;
unsigned long lastClapEnd;
int clapTimeout = 500;
int clapMax;
unsigned long currTime;
unsigned long start;
int timeToClapMax;
int clapMaxThreshold = 630;
int maxAttackTime = 10;

void setup() {
  Serial.begin(9600);
  pinMode(micPin, INPUT);
}

void getClapMaxInfo() {
  start = millis();
  clapMax = micVal;
  timeToClapMax = 0;
  currTime = millis();
  while (currTime-start < 100) {
    micVal = analogRead(micPin);
    if (micVal > clapMax) {
      clapMax = micVal;
      timeToClapMax = currTime-start;
    }
    currTime = millis();
  }
}

void waitUntilBelowThreshold() {
  micVal = analogRead(micPin);
  while (micVal > threshold) {
    delay(10);
    micVal = analogRead(micPin);
  }
}

boolean isClap() {
  return clapMax > clapMaxThreshold && timeToClapMax <= maxAttackTime;
}

void loop() {
  micVal = analogRead(micPin);

  if (micVal > threshold) {
    getClapMaxInfo();

    if (isClap())
      ++nClaps;
    waitUntilBelowThreshold();
    lastClapEnd = millis();
  } else {
    if (nClaps > 0 && millis()-lastClapEnd >= clapTimeout) {
      Serial.println(nClaps);
      nClaps = 0;
    }
  }
}

Let’s break this down.

In loop, we read from pin A0 just as in the previous section. We then say that a loud noise has started if this value is greater than threshold. This threshold was chosen after some experimentation to be 620 (the mic normally outputs 1023/2 ~ 512).

After we determine a loud noise has started, we get both the value of the largest amplitude produced by this noise and the time it took to reach that value. This is done in getClapMaxInfo.

getClapMaxInfo reads from the mic for 100 milliseconds (chosen somewhat arbitrarily) and updates the clapMax and timeToClapMax every time a value is read that is greater than the current clapMax.

After we obtain the info on the attack, we use isClap to determine whether the 100ms sound we just heard was indeed a clap. We use two thresholds for this: one for clapMax and one for timeToClapMax. (Even though the value of the clap max is not as reliable as the time to the max, claps normally reach higher than other sounds so we check it).

If the clapMax is greater than 630 and the timeToClapMax was less than or equal to 10ms, then we say we just heard a clap!

We then use waitUntilBelowThreshold to keep reading from the mic until the value is below our initial threshold of 620.

We record the timestamp after this in the variable lastClapEnd because we need some way of determining when one multi-clap command versus multiple one-clap commands have occurred. We will do this with a timeout.

After reading a clap and returning to below the threshold, loop will re-execute. If we continue to be below the threshold, we will enter the else clause.

Here, if we have claps read and the timeout has elapsed (500ms), then we print how many claps we counted and reset our clap counter.
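The timeout logic is easier to reason about away from the hardware. Here is a plain C++ sketch that applies the same rule to a list of timestamps; groupClaps is a hypothetical name, and it assumes the timestamps are the ascending times (in milliseconds) at which each clap ended.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: splits clap end times into commands. A gap of at least
// timeoutMs since the previous clap starts a new command, mirroring the
// millis()-lastClapEnd >= clapTimeout check in loop.
std::vector<unsigned> groupClaps(const std::vector<unsigned long>& clapEndMs,
                                 unsigned long timeoutMs) {
    std::vector<unsigned> groups;
    for (std::size_t i = 0; i < clapEndMs.size(); ++i) {
        const bool startsNewGroup =
            groups.empty() || clapEndMs[i] - clapEndMs[i - 1] >= timeoutMs;
        if (startsNewGroup) groups.push_back(1);
        else ++groups.back();
    }
    return groups;
}
```

For example, claps ending at 0, 300, 600, 2000, and 2200 milliseconds with a 500ms timeout come out as a three-clap command followed by a two-clap command.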

Upload this code to the Arduino and you will see it works pretty well. You need to clap fairly quickly (each within half a second of each other) and pretty loud but the code does a good job of discerning between claps and other sounds.

If your setup is not working, play with the thresholds until it does. Your microphone could have different qualities which lead you to need lower/higher amplitude thresholds or a shorter/longer attack threshold.

Putting it All Together

We are finally able to create the full remote. You can do this by basically combining the circuits for the IR transmission and mic reading sections:

Final sketch for the remote

Sorry for the ugly diagram but I’m not too good at the Fritzing application.

Now the code is basically the clap detection described in the previous section but with a few extra lines for command transmission.

For transmission, we will actually use the IRremote library so that the code is shorter and easier to understand:

#include <IRremote.h>

const int numCommands = 5;
int nClaps2Command[numCommands] = {0x90, 0x890, 0xa90, 0x490, 0xc90};
int times2Repeat[numCommands] = {1, 1, 1, 5, 5};
IRsend irsend;
... other global variables

We include the IRremote library and initialize an irsend object so we will be able to perform the transmission that we previously did manually.

We also create two arrays. nClaps2Command is a mapping from the number of claps detected to a command to send. 1 -> channel up, 2 -> channel down, 3 -> power, 4 -> volume up, and 5 -> volume down. This mapping was chosen with what commands will be used most frequently in mind.

It took me longer than I’d care to admit to decipher the form of the commands accepted by IRremote. However, once I put the hex values into a hex-to-binary converter it instantly became clear. 0x90 = 0000 1001 0000, which is 16 in little-endian form (the decimal command for channel up) followed by 1 in little-endian, the five-bit address for the TV.
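If you’d rather not do the bit reversal by hand, the conversion can be sketched in a few lines of plain C++. The function name sonyCode12 is my own; it simply writes the seven command bits and then the five address bits, each least-significant bit first, into a 12-bit value.

```cpp
#include <cstdint>

// Hypothetical helper: builds the 12-bit value in the form shown above from a
// decimal command and address by emitting each field least-significant bit first.
uint16_t sonyCode12(uint8_t command, uint8_t address) {
    uint16_t code = 0;
    for (int i = 0; i < 7; ++i)  // 7 command bits
        code = static_cast<uint16_t>((code << 1) | ((command >> i) & 1));
    for (int i = 0; i < 5; ++i)  // 5 address bits
        code = static_cast<uint16_t>((code << 1) | ((address >> i) & 1));
    return code;
}
```

Running the five decimal commands recorded earlier through it with address 1 reproduces the table: 16 gives 0x90, 17 gives 0x890, 21 gives 0xa90, 18 gives 0x490, and 19 gives 0xc90.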

We also have times2Repeat to map number of claps to number of times to execute each command. The only purpose of this is so that when volume up or down is detected, we execute it five times because it would be annoying to constantly have to clap four/five times for just a change of one in volume.

Our simple transmit function now looks like:

void transmit(int command, int repeat) {
  for (char n=0; n<repeat; n++) {
    for (char i=0; i<3; i++) {
      irsend.sendSony(command, 12);
      delay(40);
    }
  }
}

The second parameter to sendSony is the number of bits to send, which selects the version of the SIRC protocol (here, the 12-bit version).

There are now just a few changes we need to make to loop to execute the commands detected:

void loop() {
  ...
    if (isClap()) {
      ++nClaps;
      if (nClaps > numCommands)
        nClaps = 0;
    }
    ...
  } else {
    if (nClaps > 0 && millis()-lastClapEnd >= clapTimeout) {
      ...
      transmit(nClaps2Command[nClaps-1], times2Repeat[nClaps-1]);
    }
  }
}

When we hear a clap, we reset nClaps if we have read more than numCommands so that we don’t access memory outside of the array.

When our clap sequence times out, we call transmit and simply pass the command and times to repeat by indexing into our two arrays.
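The lookup itself can be tried out on a computer too. The plain C++ sketch below copies the two tables from the post and wraps the indexing in a bounds check. ClapAction, actionFor, and framesSent are hypothetical names, and returning an "invalid" action for an out-of-range count is my own variation on the reset-to-zero approach used in loop.

```cpp
struct ClapAction {
    bool valid;   // false when the clap count has no command
    int code;     // value passed to sendSony
    int repeats;  // how many times the command is executed
};

const int kNumCommands = 5;
const int kCodes[kNumCommands] = {0x90, 0x890, 0xa90, 0x490, 0xc90};
const int kRepeats[kNumCommands] = {1, 1, 1, 5, 5};

// Hypothetical helper: maps a clap count (1-based) to its command.
ClapAction actionFor(unsigned nClaps) {
    if (nClaps < 1 || nClaps > static_cast<unsigned>(kNumCommands))
        return ClapAction{false, 0, 0};
    return ClapAction{true, kCodes[nClaps - 1], kRepeats[nClaps - 1]};
}

// Each execution sends the frame three times, as transmit does.
int framesSent(const ClapAction& action) {
    return action.valid ? action.repeats * 3 : 0;
}
```

So one clap maps to 0x90 sent three times, while four claps map to 0x490 executed five times, for fifteen frames in total.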

Upload this code, point the IR diode at your TV, and clap fairly close to the microphone and you should be able to control your TV with just claps!

Here is a picture of what my final setup looks like:

Final remote circuit. The two M-F wires at the top are attached to the IR diode.

Conclusion

I hope you learned something about the Arduino, how TV remotes work, and how to deal with audio data. I know I certainly did.

You can also easily change the mapping from number of claps to commands or add more commands by modifying nClaps2Command and times2Repeat.

Thanks for reading and I hope you enjoyed it!
