Banner by Tanvi Akunuri

Me, My Family and Alexa: Living in Synergy with Technology 

by Kolappa Pillai

It’s 7:00 AM. You’re awake bright and early on a chilly December day. Your Alexa is playing your favorite radio station. You’ve set your Ring doorbell camera to notify you of any movement outside your door. You scroll through your social media of choice. You come downstairs, and your Keurig K-Supreme Plus SMART has already brewed you a French roast, one of your three “hearted” flavors on the app. You sit down and your Samsung TV turns on without you touching the remote. “TV, play Stranger Things,” you yell across the room. You’re preheating your car from the living room and sending directions to your CarPlay. While some may see this as living in the futuristic utopia envisioned by our predecessors decades ago, we need to analyze this situation in a more critical light. Where is the intelligence behind all this “smart” machinery coming from? Where is the data being processed? Who is processing it? Are you a psycho for overanalyzing all this? These questions are all completely normal.

So, where do you even begin to make sense of the world around you? We first need to investigate the IoT, or Internet of Things. According to IBM, the Internet of Things is “a network of physical devices, vehicles, appliances, and other physical objects that are embedded with sensors, software, and network connectivity, allowing them to collect and share data.” In simpler terms, the IoT is a network of devices that have internet access and can communicate with each other, directly or indirectly.

The first part of the definition, the network of devices and appliances, covers nearly every “smart” electronic you’ll see today: Apple Watches, many fridges, most modern cars, and your phone are all connected to the internet. Under the guise of convenience, all of these technologies embed themselves solidly within your home, integrating into both its physical and digital fabric. That also covers the second part of the definition: the embedded sensors, software, and network connectivity. These devices offer some benefit to your life that requires them to have internet access. This could be live biometric data (like a watch or pacemaker), remote access (appliances ranging from your thermostat to your coffee maker), or even literal internet access (smart fridges or TVs). The major problem with most of these hyper-convenient devices is that connectivity is a requirement, not an optional add-on.

The final point is the nail in the coffin. What does allowing them to collect and share data actually mean? Surely, it can’t imply total panopticon-esque surveillance. Right? Let’s start with every conspiracy theorist’s holy grail: the AI home assistant. Any Google Home or Alexa-style product released within the last decade has faced the same scrutiny from the public: “I don’t want a robot in my house collecting my data! It listens to every word you say! How else could it know when to respond?”

Yelling out for Alexa or Google may feel like signing your privacy away to a corporation, but there’s a lot more going on under the hood. The technology powering all of these yell-to-wake devices is aptly named wake-word detection, or alternatively, keyword spotting. Unlike interactions where every word is actively recorded and processed (like speech-to-text dictation, or when you’re actively talking to Alexa, Siri, or Google Home), wake-word detection is designed only to tell the device when to start recording.

How do these wake-words work?  

Step 1 of the wake-word development cycle is the data collection phase. This involves collecting thousands upon thousands of variations of the wake word itself. “Alexa,” for example, can be said in many different ways: an outback-dwelling Australian farmhand and a Chinese-American will pronounce the word completely differently, which is why these companies train their software on so many audio clips. However, there can be no positive without negative, and aptly so, negative examples are used in training as well. “A lesser,” “Alex,” or “a lack of”? None of these phrases are likely to trigger a false positive in normal conversation, because they are included in this negative training set. Beyond these sound-alikes, thousands of other random words (“electric,” “sticker,” and so on) are fed into the model to sharpen the contrast between “Alexa” and everything else.
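
As a rough illustration, a training set like this boils down to labeled pairs of audio clips. The file names below are hypothetical placeholders, not anything from a real pipeline:

    # A minimal sketch of a wake-word training set: audio clips paired
    # with binary labels. File names are hypothetical placeholders.
    positive_clips = [
        ("alexa_us_female_01.wav", 1),  # the wake word, across accents
        ("alexa_au_male_07.wav", 1),
    ]
    negative_clips = [
        ("a_lesser_03.wav", 0),   # sound-alikes that must NOT trigger
        ("alex_12.wav", 0),
        ("a_lack_of_21.wav", 0),
        ("electric_44.wav", 0),   # unrelated everyday words
        ("sticker_09.wav", 0),
    ]
    training_set = positive_clips + negative_clips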

Step 2 is the processing of this data. Since computers can’t directly understand what “Alexa” sounds like, the audio must be translated into numbers. A common method is MFCC, or Mel-Frequency Cepstral Coefficients, which describes how sound energy is distributed across frequencies. It works by cutting the audio into very small intervals, usually around 10 milliseconds each, converting each segment into a frequency spectrum, and compressing that spectrum into a small set of numbers. Instead of “seeing” a sound bite, the model sees something like the table below:

 

    time step    features
    ---------    -----------------------
    t1           [0.32, -1.4, 0.85, ...]
    t2           [0.30, -1.3, 0.81, ...]
    t3           [0.29, -1.2, 0.78, ...]
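
If you want to see such a fingerprint for yourself, here is a minimal sketch using the open-source librosa library. The 16 kHz sample rate and 13 coefficients are common speech-processing defaults, not Amazon’s actual configuration:

    import librosa

    # Load an audio clip at 16 kHz, a typical rate for speech processing.
    y, sr = librosa.load("alexa_clip.wav", sr=16000)

    # 13 MFCCs per frame; hop_length=160 samples at 16 kHz means one
    # feature vector roughly every 10 milliseconds, as described above.
    mfccs = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13, hop_length=160)

    print(mfccs.shape)   # (13, number_of_time_steps)
    print(mfccs[:, 0])   # the feature vector for the first ~10 ms slice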

 

 

With this “fingerprint,” a supervised learning model is trained to recognize the wake-word, outputting the probability that any spoken sound is the chosen word. Based on that probability, ranging from 0 to 1, the machine decides whether what it heard counts as a valid trigger. This check runs constantly, many times per second. Through TinyML, the practice of running machine learning models on tiny microcontrollers, the detector is embedded directly in the smart speaker’s own hardware. A helpful analogy for the whole process is hearing your name at a crowded party: you’ll snap to a state of alertness, and even though you couldn’t name a single topic of discussion in that crowd, you’ll hear your name most clearly.
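
Stripped of the engineering details, that always-on decision loop looks something like the sketch below. The toy logistic “model” and the 0.9 threshold are stand-ins, since the real networks and their tuning are proprietary:

    import numpy as np

    rng = np.random.default_rng(0)
    WEIGHTS = rng.normal(size=13)  # stand-in for trained model parameters
    THRESHOLD = 0.9                # hypothetical cutoff, tuned by vendors

    def wake_word_probability(mfcc_frame: np.ndarray) -> float:
        """Toy logistic classifier standing in for the trained network."""
        return 1.0 / (1.0 + np.exp(-WEIGHTS @ mfcc_frame))

    def listen_loop(mfcc_frames) -> None:
        """Run the always-on check over a stream of ~10 ms feature vectors."""
        for frame in mfcc_frames:
            if wake_word_probability(frame) >= THRESHOLD:
                print("Wake word detected: start recording and streaming")
            # Below the threshold, the frame is simply discarded;
            # nothing is stored or sent anywhere.

    # Fake frames standing in for a live microphone feed:
    listen_loop(rng.normal(size=(5, 13)))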

While Alexa isn’t recording every millisecond of audio, a few bad experiences can rapidly sour the IoT discussion. In Portland, Oregon in 2018, a woman publicly identified only as “Danielle” claimed that her family was being recorded by their Amazon Echo, after one of her husband’s employees received a small audio file of the couple talking. Amazon later confirmed this was true, explaining that a series of unlikely events caused it: a word similar enough to “Alexa” triggered the wake-word protocol, and the device then misinterpreted parts of the conversation as a request to send a message to a contact.

So what’s the point of all this explanation followed by a seemingly contradictory backtrack? The point is not to bridge into some reptile-invasion theory, but to note that the best way to stay vigilant and protective of your data, without falling into conspiracies and misinformation, is to be educated on all sides of the issue. You may never be quizzed on how a smart home assistant knows its own name, but knowing how it collects data on you can be the deciding factor in a $100 purchase. And once you’re informed enough to recognize Danielle’s incident as a malfunction rather than malice, you can also see its implications: what’s to stop a home assistant from misinterpreting a second request, unintentionally sending even more sensitive information to outside parties?

 Smart home assistants are part of a larger web, one that is knotting up your entire home. Microphones, cameras, and sensors are all becoming more and more normalized. And while the corporations creating these devices may want you to believe these are iron-clad privacy systems protecting you, you can see how they may falter at the weakest links.  

Trekking back to the original idea of the IoT: every smart device is exactly that, a network endpoint. Each device connected to your home WiFi has its own faults; it has its own software, its own network login, its own communication back to company servers, and its own internal firmware (the software dictating how the physical parts of the device function) that might never be updated. All of these carry their own vulnerabilities. The Cybersecurity and Infrastructure Security Agency, or CISA for short, notes that the IoT increases your cyber risk through the sheer amount of interconnectedness.
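
You can get a feel for just how many endpoints you live with by probing your own network. Below is a minimal sketch that checks a slice of a typical home subnet for a few ports IoT devices commonly expose; the address range and port list are illustrative, and you should only ever scan a network you own:

    import socket

    # Ports commonly exposed by IoT gear: web UIs, TLS, RTSP cameras, MQTT.
    COMMON_IOT_PORTS = [80, 443, 554, 1883, 8080]

    def open_ports(host: str, timeout: float = 0.5) -> list[int]:
        """Return which of the common ports answered on this host."""
        found = []
        for port in COMMON_IOT_PORTS:
            with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
                s.settimeout(timeout)
                if s.connect_ex((host, port)) == 0:  # 0 means it connected
                    found.append(port)
        return found

    # Probe the first few addresses of a typical home subnet (illustrative).
    for last_octet in range(1, 21):
        host = f"192.168.1.{last_octet}"
        if (ports := open_ports(host)):
            print(f"{host} is listening on {ports}")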

While your iPhone or laptop may bug you constantly about security updates, you don’t usually think about how a nanny-cam or thermostat might need its own security at all. Many of these devices ship with only the most basic protection: default login credentials, often “admin” for both the username and password, as reports of exposed passwords during data breaches have shown. Hackers can then use this information to peep through cameras or intercept network traffic.
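
This is also the easiest weakness to audit yourself. As a rough sketch, you could check whether a device on your own network still accepts factory-default credentials over its local web interface; the address and credential list below are hypothetical examples:

    import requests

    # Factory-default credential pairs frequently cited in breach reports.
    DEFAULT_CREDS = [("admin", "admin"), ("admin", "password"), ("admin", "")]

    def accepts_default_login(base_url: str) -> bool:
        """Try each default pair against a device's HTTP basic auth."""
        for user, pw in DEFAULT_CREDS:
            try:
                resp = requests.get(base_url, auth=(user, pw), timeout=3)
            except requests.RequestException:
                return False  # device unreachable; nothing to test
            if resp.status_code == 200:
                print(f"{base_url} accepts default login {user!r}/{pw!r}")
                return True
        return False

    # Only run this against devices you own, e.g. a camera on your own LAN:
    accepts_default_login("http://192.168.1.50")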

Still, the problem only escalates from here. The main issue isn’t just that the device is seeing or hearing you, but where that data is going. While your nanny cam might have an SD card in it, we all know that’s not where four years of your recordings are being held. Locally, these machines might hold only a few hours or days of footage; the majority is sent to a server managed by the company you bought the device from. That data sits in dedicated data centers and can be turned over to the authorities at the behest of officials managing high-profile investigations.

And while you may be okay with certain information being used, the government has been involved in shadier business. Pivoting a bit away from traditional tech: the company GEDmatch found itself at the center of a controversy in 2018, during the investigation of the Golden State Killer, when the website was contacted for DNA information. The service GEDmatch provided was the ability to upload your DNA data gathered from Ancestry.com, 23andMe, and the like, and “connect” with family members, meaning those genetically closest to you who also chose to upload their DNA.

The problem with this big family tree was that it allowed law enforcement to track down entire families of people who had never consented. Investigators used the killer’s crime-scene DNA to find close matches on the site, mapped out those matches’ relatives, and narrowed the tree down until they had the GSK’s identity: Joseph James DeAngelo, a former police officer. While this particular use of the information may have been justified, it serves the same purpose as the Alexa story from before: it shows the potential of this technology to be pushed into moral gray areas when needed.

More relevant to our other examples, Ring, the home security service, is very often involved in these legal situations. When legally required, Ring responds to subpoenas, court orders, and warrants. This is all fine and dandy, but the problem comes with the environment it creates: Ring has eyes on most angles of most neighborhoods. The FTC also alleged that Ring let employees and contractors access customers’ private videos and failed to implement basic protections of people’s data, which led to a multimillion-dollar settlement paid out to customers after hackers were able to access quite a few cameras.

In more recent news, the video game company Niantic has been building a large geospatial model of the Earth using scans made with the help of players of one of its more popular games, one you may know as Pokémon GO. Once again, the terms and conditions come back to bite consumers: many who never read them, or never understood their implications, have allowed a company to collect scans of their surroundings while tracking their every move. Similarly, in January of 2025, the FTC took action against General Motors for allegedly collecting and using data from millions of vehicles, tracking location data and driving behavior, a case that took a year to reach a settlement.

This isn’t the only recent fiasco the FTC has attempted to settle, either. InMarket, a marketing company; Mobilewalla, a data broker; and Gravy Analytics and Venntel, both location-data brokers, were all charged by the FTC and had strict rules imposed on them between 2024 and 2026: some were banned from selling data, others from buying it. This is an increasingly common trend among large corporations: designing products and business dealings to maximize data collection and strip away the layer of privacy between company and consumer, making people that much easier to compress into data points and analyze.

While Ring outright states its involvement with law enforcement on its website, and customers know what they’re signing up for, there is still a looming sense of unease many may feel. If you knew there was a chance someone would someday scroll through your entire camera-feed history, is anything you do really private under the watchful eye of the camera? At what point does this safety net become a suffocating trap? Your data is no longer limited to your Google searches; it now extends to your every move and your DNA, your home security footage and everywhere you visit, all at the fingertips of these companies.