Subcommittee on Data
Gregory Slater, Director of Planning and Preliminary Engineering, Maryland State Highway Administration
Data Subcommittee Efforts
Principle 1 - VALUABLE:
Data is an asset—Data is a core business asset that has value and is managed accordingly.
Principle 2 - AVAILABLE:
Data is open, accessible, transparent and shared—Access to data is critical to performing duties and functions; data must be usable for diverse applications and open to all.
Principle 3 - RELIABLE:
Data quality and extent are fit for a variety of applications—Data quality is acceptable and meets the needs for which it is intended.
Principle 4 - AUTHORIZED:
Data is secure and compliant with regulations—Data is trustworthy and is safeguarded from unauthorized access, whether malicious, fraudulent or erroneous.
Principle 5 - CLEAR:
There is a common vocabulary and data definition—Data dictionaries are developed and metadata established to maximize consistency and transparency of data across systems.
Principle 6 - EFFICIENT:
Data is not duplicated—Data is collected once and used many times for many purposes.
Principle 7 - ACCOUNTABLE:
Decisions maximize the benefit of data—Timely, relevant, high-quality data are essential to maximize the utility of data for decision-making.
For a more in-depth look at each principle, keep reading...
Data is an asset
Rationale—Data is a core industry asset that has measurable value and is managed accordingly. Accurate, timely data is critical to accurate, timely decisions. Transportation agencies already manage many of their physical assets: roads, bridges, signs, lights, etc. Data is no different and must be managed with the same care as those physical assets. Data is the foundation of our decision-making, so we must also carefully manage and maintain data to ensure that we know what we have and where it is, can rely upon its accuracy, and can obtain it when and where we need it. Where possible, data should be archived to maintain historical records.
Implications—Treating data as the asset that it is saves money, effort and resources. When data is appropriately handled it can have a long life with many uses beyond its original one, and serve projects as yet unplanned.
Data is open, accessible, transparent, and shared
Rationale—The value of data is increased when it can be used with other data and in a variety of applications. Users must have access to the data critical to their duties and functions. Wide access to data leads to efficiency and effectiveness in decision-making, and affords timely response to information requests. Using data must be considered from an enterprise perspective (across the organization or across multiple organizations) to allow access by a wide variety of users. Transportation agencies at all levels of government (federal to state to local) hold a wealth of diverse data sets, but these are often stored in different databases that are incompatible with each other or difficult to find. Timely access to accurate data is essential to improving the quality and efficiency of decision-making. It is less costly to maintain timely, accurate data and then share it than it is to maintain duplicative data in multiple locations or processes. Shared data will result in improved decisions, since we will rely on fewer sources of more accurate and timely managed data for decision-making. Sharing is also necessary to triangulate on subjects that may not be measured directly, and allows for serendipity. Insights often come from bringing fresh eyes to data.
As transportation organizations work with more stakeholders and external partners, it is essential that data be shared. Making data electronically available will result in increased efficiency when existing data entities can be re-used. It is more effective to err toward opening transportation data than toward over-protecting it.
Implications—Agencies are increasingly under an informal mandate to "do more with less." Sharing data is a key step in executing this mandate. Accessible data will ultimately reduce the burden on staff time.
Data quality is fit for purpose
Rationale—Data quality is acceptable and meets the need for which it is intended. Data that is collected, produced, and reported must be fit for purpose; that is, of sufficient accuracy and integrity, proportional to its use and to the cost of collection and maintenance. Data is used in all areas of the transportation decision-making process, from planning to design to operations to performance management. Furthermore, it is increasingly being used externally by citizens and customers to inform their personal decisions, and by stakeholders to assess the aggregate performance of a transportation organization. Significant human and system resources are consumed in the collection, manipulation and dissemination of data, whether of high quality or not, so it is essential that the most effective use of public funds is achieved through appropriately directed attention to data quality and the procedures to realize quality. Additionally, data must be archived appropriately to preserve both its usefulness and the historical record. When possible, data should be spatially oriented. Data quality increases as the application of the data increases. Data that has spatial orientation or attribution can easily be used in GIS. When data assets can be analyzed in a spatial context, not only can a fuller analysis be completed in terms of geographic context, but the data and any analysis results can also be more easily communicated via mapping and other formats better suited to public understanding.
Implications—When data is fit for purpose, appropriate cost decisions are made in its collection and use. In cases where a rough sketch is sufficient, data collection and use may be scaled accordingly. Where large programs, investments, or systems are being developed and vetted, the data must be fit for that purpose. Data precision is matched to the task at hand.
Data is secure and compliant with regulations
Rationale—Data is trustworthy and is safeguarded from unauthorized access, whether malicious, fraudulent or erroneous. Open sharing of information and the release of information via relevant agreement must be balanced against the need to restrict the availability of classified, proprietary, and sensitive information.
Implications—When data is secure and appropriately regulated there is greater trust and confidence in its use.
There is a common vocabulary and data definition
Rationale—Both unstructured and structured data must have a common definition to enable sharing of data. However, a common definition must not reduce the data's usefulness for its original purpose. Commonality may take the form of relations, bridges and crosswalks between definitions.
Implications—A common vocabulary will facilitate communications, enable dialogue to be effective and facilitate interoperability of systems; however, utility must not be compromised.
Data is not duplicated
Rationale—Information services should be developed for, and made available to, multiple users and stakeholders. This is preferred over the development of information and data silos that are used for only a single purpose or user.
Implications—Duplicative capability is expensive and propagates conflicting data. It also goes against a policy of sustainability in the use of data and the infrastructure resources required to maintain the data, such as computer servers and data warehouses.
Decisions maximize the benefit of data
Rationale—The purpose of data collection is to help support the decision-making process. Users of the data, as well as information derived from the data, are the key stakeholders in the data collection and analysis process. The data is being collected to address a certain policy goal or objective. In order to ensure information management is aligned with the purpose, users must be involved in the different aspects of the information environment. The decision makers, managers, and the technical staff responsible for developing and sustaining the information environment need to come together as a team to jointly define the goals and objectives of the data collection processes.
Implications—Resources are limited. Maximizing existing resources is essential.
A scoping study is underway under the auspices of the Transportation Research Board National Cooperative Highway Research Program quick response tasks (NCHRP 8-36). Task 129, Scoping Study to Establish Standards and Guidance for Data for Transportation Planning and Traffic Operations Purposes, was selected in May 2013. The work is expected to commence in late 2014.