Automatic toll booths that detect a vehicle passing through the toll for billing purposes are now common on highways. Automatic tolls can also serve other valuable purposes, such as alerting police when a suspicious vehicle crosses the toll booth and monitoring traffic patterns. An automatic toll system consists of a data capture and alerting application at each toll point and a centralized system for generating billing information. Because data capture at each toll booth must handle high volumes reliably and with extremely low latency, it is often necessary to have a local database repository at each toll booth in addition to the centralized repository. The rest of this article focuses on the characteristics and requirements of the repository that manages data at each toll booth.
There are two common ways in which the automatic toll system can identify a vehicle that is crossing the toll. The first is a transponder, issued by the highway authorities and installed in the vehicle. As the vehicle crosses the toll point, sensors at the toll booth detect the transponder's signal, which identifies the vehicle. The second is a high-speed camera that takes a picture of the vehicle's license plate as it crosses the toll; optical character recognition (OCR) is then applied to that image to identify the vehicle. For law enforcement purposes, toll booths most often have cameras that take a picture of every vehicle's license plate as it crosses the toll.
Once the vehicle has been identified, a record of the toll crossing is inserted into the toll booth's local database. This record might include the vehicle ID, a timestamp, the license plate image, the ID of the toll booth and the toll charged. Periodically, this data is sent to a central repository, which processes it for billing purposes. In addition, the toll booth system typically checks the vehicle against a "suspicious vehicle" database to determine whether law enforcement authorities need to be alerted about the location of the vehicle. Such an alert needs to be delivered very quickly, before the vehicle has traveled far from the toll booth.
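The capture, alert and upload flow described above can be sketched as follows. This is a minimal illustration only: it uses Python's standard sqlite3 module as a stand-in for a local SQL repository and is not the Berkeley DB API. The table layout, the column names and the functions open_booth_db, record_crossing, pending_uploads and mark_uploaded are all hypothetical names chosen for this example.

```python
import sqlite3
import time


def open_booth_db(path=":memory:"):
    """Open the local store and create the (hypothetical) tables if missing."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS crossings ("
        " id INTEGER PRIMARY KEY,"
        " vehicle_id TEXT NOT NULL,"
        " crossed_at REAL NOT NULL,"
        " booth_id TEXT NOT NULL,"
        " toll_cents INTEGER NOT NULL,"
        " plate_image BLOB,"
        " uploaded INTEGER NOT NULL DEFAULT 0)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS watchlist (vehicle_id TEXT PRIMARY KEY)"
    )
    return conn


def record_crossing(conn, vehicle_id, booth_id, toll_cents, plate_image, now=None):
    """Store one crossing in a transaction, then check the watchlist.

    Returns True when the vehicle is on the suspicious-vehicle list.
    """
    crossed_at = time.time() if now is None else now
    with conn:  # commits on success, rolls back if the insert fails
        conn.execute(
            "INSERT INTO crossings"
            " (vehicle_id, crossed_at, booth_id, toll_cents, plate_image)"
            " VALUES (?, ?, ?, ?, ?)",
            (vehicle_id, crossed_at, booth_id, toll_cents, plate_image),
        )
    # Primary-key lookup, so the check stays fast as the list grows.
    hit = conn.execute(
        "SELECT 1 FROM watchlist WHERE vehicle_id = ?", (vehicle_id,)
    ).fetchone()
    return hit is not None


def pending_uploads(conn, limit=1000):
    """Return crossings not yet sent to the central repository, oldest first."""
    return conn.execute(
        "SELECT id, vehicle_id, crossed_at, booth_id, toll_cents"
        " FROM crossings WHERE uploaded = 0 ORDER BY id LIMIT ?",
        (limit,),
    ).fetchall()


def mark_uploaded(conn, ids):
    """Flag crossings as sent, once the central repository has confirmed them."""
    with conn:
        conn.executemany(
            "UPDATE crossings SET uploaded = 1 WHERE id = ?", [(i,) for i in ids]
        )
```

In this sketch the caller would raise the law enforcement alert whenever record_crossing returns True, and a periodic job would send the rows from pending_uploads to the central repository and call mark_uploaded only after the transfer succeeds.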
Information about each vehicle’s toll crossing needs to be retained for a fixed period of time (e.g. three months) after which the data can be either discarded or moved to a different repository for long-term archival purposes.
From the above description of the functional requirements of an automatic toll booth system, it is possible to infer the key requirements of the data management repository used. The repository should be highly reliable and highly available, because a malfunctioning automatic toll booth can quickly cause traffic jams, especially during peak traffic hours. It should support very high throughput data capture, including large objects like images. It should support very low latency lookups against the "suspicious vehicle" database. It should be able to manage hundreds of gigabytes of data without performance degradation, and it should provide features to move or delete data that is past its retention period. Finally, the automatic toll booth system must be completely self-managing and should not require a manual operator to administer the system on a regular basis (no DBA required!). The underlying data management repository must therefore provide data manipulation capabilities as well as APIs and utilities that enable self-administration.
Berkeley DB is an embeddable database system that is ideally suited for addressing these requirements, with APIs for data manipulation as well as database administration.
Berkeley DB is designed to manage hundreds of gigabytes of data and hundreds of millions of records efficiently and reliably. Berkeley DB’s transactional capabilities support reliable data capture at rates of tens of thousands of records per second. Berkeley DB provides concurrent, extremely low latency access to data – a single record can be retrieved in less than 10 milliseconds on a commodity hardware platform.
Besides key-value APIs for storing and retrieving records (for example, DB->put() and DB->get() in the C API), Berkeley DB also supports a standard SQL API for manipulating data, which can be a significant convenience for the application developer. Berkeley DB also provides a range of access methods for efficient data access and management: B-tree, hash, queue, recno and heap. Further, Berkeley DB supports data partitions, which can be very useful for addressing the data retention requirements. For example, the automatic toll booth application could create one data partition per month of information. "Old" data records that no longer need to be retained can then be moved or deleted simply by moving or deleting the corresponding partition.
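The one-partition-per-month idea can be illustrated with a short sketch. It does not use Berkeley DB's own partitioning feature. As an assumption made for this example, each month is modeled as a separate table in Python's standard sqlite3 module, so that expiring a month is a single drop rather than a record-by-record delete. The names partition_name, insert_crossing and drop_expired, and the crossings_YYYY_MM table naming, are hypothetical.

```python
import sqlite3
from datetime import datetime, timezone


def partition_name(ts):
    """Map a UNIX timestamp to a per-month table name such as crossings_2024_02."""
    d = datetime.fromtimestamp(ts, tz=timezone.utc)
    return f"crossings_{d.year:04d}_{d.month:02d}"


def insert_crossing(conn, vehicle_id, ts):
    """Insert a crossing into the table for its month, creating it on first use."""
    table = partition_name(ts)  # generated internally, never taken from user input
    with conn:
        conn.execute(
            f"CREATE TABLE IF NOT EXISTS {table} (vehicle_id TEXT, crossed_at REAL)"
        )
        conn.execute(f"INSERT INTO {table} VALUES (?, ?)", (vehicle_id, ts))


def drop_expired(conn, now_ts, keep_months=3):
    """Drop monthly tables older than the retention window; return their names."""
    d = datetime.fromtimestamp(now_ts, tz=timezone.utc)
    # Month index of the oldest month to keep (the current month counts as one).
    oldest = d.year * 12 + (d.month - 1) - (keep_months - 1)
    cutoff = f"crossings_{oldest // 12:04d}_{oldest % 12 + 1:02d}"
    names = [
        row[0]
        for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
        if row[0].startswith("crossings_")
    ]
    # Zero-padded names sort chronologically, so string comparison is enough.
    expired = sorted(n for n in names if n < cutoff)
    with conn:
        for name in expired:
            conn.execute(f"DROP TABLE {name}")
    return expired
```

A maintenance task could call drop_expired once a day. To archive instead of discard, the same loop could copy each expired table to long-term storage before dropping it.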
Berkeley DB provides APIs and utilities for administering the database and performing maintenance operations such as log archiving, checkpointing, transactional recovery on application restart and index compaction. These utilities and APIs enable the application developer to build a compact, high-performance, self-managing application that can be deployed at each toll booth in a cost-effective manner.
Berkeley DB is a mature, robust database solution that has been used in a wide variety of data management applications, with over 200 million production deployments. Berkeley DB customers appreciate its reliability, performance, ease of use and flexibility, as well as the ability to get commercial support for their applications. Berkeley DB is a solution you can rely on for a wide variety of high-performance, high-availability enterprise database applications.
Please see www.oracle.com/technetwork/database/berkeleydb/learnmore/bdbtollboothexample-2508507.zip for a sample automatic toll booth application that illustrates many of the relevant features of Berkeley DB discussed in this article.