In late 2013 I began seriously putting together a handful of Raspberry Pi projects. Here’s a detailed writeup of the most promising of them: using a Pi as a car black box. The Pi can handle multiple sensors, store that data, and get it uploaded to a central cloud storage service.
The database and Big Data lover in me wants data, lots of it. So I’ve gone with building a black box for my car that runs all the time the car is on, and logs as much data as I can capture. This includes:
- and more
Once you’ve got a daemon running and its inputs are being saved, everything else is just another input. It doesn’t matter what the sensor is. To the logger it’s all just input data.
My initial goal is to build a black box that constantly logs OBD2 data and stores it in a database. Looking around at what’s out there for OBD2 software, I don’t see anything that’s built for long-term logging. All the software out there is meant for two use cases: (1) live monitoring and (2) tuning the ECU to get more power out of the car. What I want is a third use case: long-term logging of all available OBD2 data to a database for analysis.
To store all this data I decided to build an OBD2 storage architecture made up of:
- MySQL database
- JSON + REST web services API
- SDK that existing OBD2 software would use to store the data it’s capturing
- Existing open source OBD2 capture code, wrapped up so it runs as a daemon on the Pi
- Logging data to a local storage buffer, which then gets synced to the aforementioned cloud storage service when there’s an internet connection.
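To make the JSON + REST piece a little more concrete, here’s a rough sketch of what a single logging call could look like on the wire. The endpoint URL, the field names, and the `build_log_request` helper are all made up for illustration; the real API may look quite different. The request is only built here, not sent.

```python
import json
import urllib.request

# Hypothetical endpoint; the real API URL isn't given in this post.
API_URL = "https://example.com/api/v1/log"


def build_log_request(api_key, pid, value, timestamp):
    """Build (but don't send) a JSON POST request for one PID reading.

    The field names here are assumptions, not the actual API contract.
    """
    body = json.dumps(
        {"api_key": api_key, "pid": pid, "value": value, "timestamp": timestamp}
    ).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Values are sent as strings here, purely as an illustrative choice.
request = build_log_request("demo-key", "RPM", "2500", 1388534400)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Passing the request to `urllib.request.urlopen` would actually send it, which is the step the sync process would perform once a connection is available.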
Right now I’m just doing this for myself. But I’m also reaching out to developers of OBD2 software to gauge interest in adding this storage service to their work. In addition to the storage, an API can be added for reading the data back, such as pulling DTCs (diagnostic trouble codes, i.e. error codes), getting trends and summary data, and more.
The first SDK I wrote was in Python. It’s available on GitHub. It includes API calls to register an email address to get an API key. After that, there are some simple logging functions to save a single PID (OBD2 data point such as RPM or engine temp). Since this has to run without an internet connection I’ve implemented a buffer. The SDK writes to a buffer in local storage and when there’s any internet connection a background sync daemon pulls data off the buffer, sends it to the API and removes the item from the buffer. Since this is all JSON data and very simple key:value data I’ve gone with a NoSQL approach and used MongoDB for the buffer.
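The buffer-then-sync pattern can be sketched in a few lines. This is not the SDK’s actual code: to keep the example self-contained it swaps MongoDB for Python’s built-in `sqlite3`, and the function names (`open_buffer`, `log_pid`, `sync`, `pending`) and the `upload` callable are hypothetical stand-ins for the real SDK and API calls.

```python
import json
import sqlite3
import time


def open_buffer(path=":memory:"):
    """Open (or create) the local buffer. SQLite stands in for MongoDB here."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS buffer ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, payload TEXT NOT NULL)"
    )
    return conn


def log_pid(conn, pid, value, ts=None):
    """Append one PID reading to the buffer as a JSON document."""
    doc = {"pid": pid, "value": value, "ts": time.time() if ts is None else ts}
    conn.execute("INSERT INTO buffer (payload) VALUES (?)", (json.dumps(doc),))
    conn.commit()


def sync(conn, upload, batch_size=100):
    """Upload buffered readings oldest-first, deleting each only after success.

    `upload` is a hypothetical callable: it takes a dict and returns True
    if the API accepted it. Returns the number of readings uploaded.
    """
    rows = conn.execute(
        "SELECT id, payload FROM buffer ORDER BY id LIMIT ?", (batch_size,)
    ).fetchall()
    sent = 0
    for row_id, payload in rows:
        if not upload(json.loads(payload)):
            break  # No connection or API error: keep the rest for the next run.
        conn.execute("DELETE FROM buffer WHERE id = ?", (row_id,))
        conn.commit()
        sent += 1
    return sent


def pending(conn):
    """Count readings still waiting in the buffer."""
    return conn.execute("SELECT COUNT(*) FROM buffer").fetchone()[0]


conn = open_buffer()
log_pid(conn, "RPM", 2500)
log_pid(conn, "COOLANT_TEMP", 90)
sync(conn, upload=lambda doc: False)  # Offline: nothing leaves the buffer.
print(pending(conn))
sync(conn, upload=lambda doc: True)   # Online: buffer drains.
print(pending(conn))
```

Deleting a reading only after its upload succeeds means a dropped connection can cause a duplicate upload but not a lost data point, which seems like the right trade-off for a logger.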
The API is built in PHP and runs under Apache on a standard Linux VPS. At this point the entire stack has been built. The code’s nowhere near production-ready and is missing some features, but it works well enough to demo. I’ve built a test utility that simulates a client car logging 10 times per second, with 10 different PIDs each time. This all builds up in the local buffer, and the sync script then uploads it to the API and clears it out. At that rate, the client generates 100 data points per second. For a car driven an average of 101 minutes per day, that’s 606,000 data points per day.
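The arithmetic behind that daily figure is easy to check. Here’s a quick sketch; the function name is mine, and the yearly total assumes the car is driven every single day, which is an assumption rather than something I’ve measured.

```python
def data_points_per_day(samples_per_second, pids_per_sample, minutes_per_day):
    """Data points logged per day at a fixed sampling rate."""
    per_second = samples_per_second * pids_per_sample
    return per_second * 60 * minutes_per_day


daily = data_points_per_day(samples_per_second=10, pids_per_sample=10, minutes_per_day=101)
print(daily)        # 606000
print(daily * 365)  # 221190000, assuming the car is driven every day
```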
The volume of data will add up fast. For starters, the main database table I’m using stores every PID value as a string, one record per reading. In the future, I’ll evaluate pivoting this data so that each PID has its own field (with an appropriate data type) in a table. We’ll see which method proves more efficient and easier to query. The OBD2 spec lists all the standard PIDs. Car manufacturers aren’t required to support them all, and they can add their own proprietary ones too. Hence my ambivalence for now about creating a logging table that contains a field for each PID. If most of the fields are empty, that’s a lot of wasted storage.
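To show the difference between the two layouts, here’s a small sketch that pivots "one record per PID" rows into "one row per timestamp" with typed fields. The PID names, the type map, and the `pivot` function are illustrative assumptions, not my actual schema.

```python
from collections import defaultdict

# Hypothetical type map: which Python type each PID's string value converts to.
PID_TYPES = {"RPM": int, "COOLANT_TEMP": float}


def pivot(records):
    """Turn narrow (timestamp, pid, value-as-string) records into wide rows.

    Returns {timestamp: {pid: typed_value}}. PIDs with no reading at a given
    timestamp are simply absent, which is where a fixed-column table would
    end up storing empty fields.
    """
    rows = defaultdict(dict)
    for ts, pid, value in records:
        cast = PID_TYPES.get(pid, str)  # Unknown or proprietary PIDs stay strings.
        rows[ts][pid] = cast(value)
    return dict(rows)


narrow = [
    (1.0, "RPM", "2500"),
    (1.0, "COOLANT_TEMP", "90.5"),
    (2.0, "RPM", "2600"),
]
wide = pivot(narrow)
print(wide[1.0])
print(wide[2.0])
```

The narrow layout copes with any PID a manufacturer throws at it, while the wide layout gives proper data types and simpler queries at the cost of sparse columns.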
Systems integration is much more of a factor in this project than coding each underlying piece. Each underlying piece, from what I’ve found, has already been coded somewhere by some enthusiast. The open source Python code already exists for reading OBD2 data. That solves a major coding headache and makes it easier to plug my SDK into it.
There are some useful smartphone apps that can connect to a Bluetooth OBD2 reader to pull the data. Even if they were to use my SDK, a phone still isn’t an ideal solution for logging. To log this data, you need a dedicated device that’s always on when the car’s on and always logging. Using a smartphone can get you most of the way there, but there’ll be gaps. That’s why I’m focusing on using my Pi as a black box for this purpose.