S6 Data standards
Types of data format which could be used as standard
A variety of data standards exist, each used to provide a different type of travel data or to provide data in a particular way to a data processor for onward use by data consumers. Data formats are established standards for data exchange which individual organisations can follow to provide their data to another person. Some data formats are more commonly known by their acronym. For example, the National Public Transport Access Nodes standard, a UK Government standard for bus stops, railway stations and ferry ports, is commonly known simply as ‘NaPTAN’.
Within bus service information, the most common standards in operation are TransXChange (“TXC”) for timetables, Network Timetable Exchange format (“NeTEx”) for fares, Service Interface for Real Time Information (“SIRI”) for real time data, and NaPTAN for bus stop information. No single data format currently exists that can provide timetable information, fares information, real time data and bus stop information.
There are seven alternative data formats that can be used to provide information about the times of services and stopping places in addition to TransXChange. These include ATCO CIF, JESS HASTUS, NeTEx, and SIRI. There is a single standard for bus stop location and facilities data, the NaPTAN data format. Fares information can be provided in the NeTEx or GTFS standards, while real time location and real time timetable information can be supplied via SIRI or GTFS. Real time onboard information about accessibility or capacity can use the SIRI standard.
While the intention is to prescribe the required standard in legislation, Transport Scotland also intends to keep the development of new and existing standards under review, and this may mean updating legislation to specify a different standard in the future, if that best serves the needs of passengers and operators.
Routes, stopping places, and timetables data
Route and timetable data for bus services in Scotland is already published openly through the Traveline National Data Set. However, that dataset is based on the requirements of bus registration and is not always detailed enough to use in journey planners. For example, bus operators are required to provide a proposed timetable indicating proposed times at principal points on the route to register a service, while journey planning requires every bus stop on the route to be included.
We propose to require operators to provide in a digital format a timetable with information provided at bus stop level, and route data provided as a list of points, which allow the route to be drawn with sufficient detail to follow the associated road geometry (as opposed to presuming a route between bus stops).
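To illustrate the distinction drawn above, the following is a minimal Python sketch and not a prescribed schema. The class name `StopTime`, the stop codes and all coordinates are hypothetical and chosen only for illustration. It holds a timetable at bus stop level alongside a route track whose additional point lets a data consumer follow a bend in the road rather than presume a straight line between stops.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt


@dataclass(frozen=True)
class StopTime:
    stop_code: str   # hypothetical stop identifier
    departure: str   # scheduled departure, "HH:MM"
    location: tuple  # (latitude, longitude) in decimal degrees


def haversine_m(a, b):
    """Great-circle distance in metres between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(radians, (a[0], a[1], b[0], b[1]))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6_371_000 * asin(sqrt(h))


def path_length_m(points):
    """Total length in metres of a path given as a list of (lat, lon) points."""
    return sum(haversine_m(p, q) for p, q in zip(points, points[1:]))


# Timetable at bus stop level: every stop on the route has a time.
timetable = [
    StopTime("STOP_A", "09:00", (55.9500, -3.1900)),
    StopTime("STOP_B", "09:04", (55.9500, -3.1800)),
]

# Route track: a list of points, including a bend between the two stops.
route_track = [(55.9500, -3.1900), (55.9520, -3.1850), (55.9500, -3.1800)]

straight_line_m = path_length_m([s.location for s in timetable])
track_m = path_length_m(route_track)
print(f"stop-to-stop: {straight_line_m:.0f} m, along track: {track_m:.0f} m")
```

In this made-up example the track is longer than the straight line between the two stops, which is the detail that would be lost if only stop locations were supplied.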
Data submitted to Traveline Scotland is currently provided in a variety of formats (although most commonly TransXChange 2.1). Because data is received in a number of formats, it requires further processing by the Traveline Scotland open data hub system before it can be provided in a consistent format. Additionally, legacy formats are still in use which do not include as many data fields as modern formats. This limits the quality of the data that can be provided; for example, legacy formats often do not include accessibility information. In order to align with standards in England and Wales, it is likely that we would require the data in the TransXChange format. We propose that operators would fulfil their statutory requirement to provide open data by submitting their timetable data to the open data hub (Traveline Scotland) in the TransXChange format.
Fares and ticket data
As sharing fares information is not currently mandatory in Scotland, only certain operators currently share this information. The Traveline Scotland open data hub system also converts a small number of proprietary data formats from Electronic Ticket Machines to an in-house standard for use in the passenger information system. This is not an automated process, and while the fares data that is currently supplied is of good quality, it is not in a consistent format, which creates inefficiency in data transformation. The Traveline Scotland open data hub is currently adopting NeTEx for fares data in Scotland, and we expect from discussions with industry that operators will begin to provide fares data voluntarily in this standard. However, we will look to bring in a standard to ensure consistency.
With the introduction of the Bus Open Data Regulations, operators will soon have a statutory responsibility to make fares information available in an open format, in the open data hub. Third parties may assist operators with complying with this duty, but compliance will be the responsibility of operators.
We propose to require data to be submitted in the NeTEx standard. This would align with the DfT’s approach, providing cross-border consistency. We also propose that fares information be phased in over time, with simple ticket information brought in first, followed by complex ticket information, in order to give bus operators a reasonable period to put the necessary systems in place.
Real time data
Real time information is needed for passengers to be updated on the live status of a bus service. This increases confidence in using public transport. Real time information is proposed to include information on vehicle locations, bus stop arrival and departure times, live timetables, and on-bus information including capacity and wheelchair space utilisation.
SIRI is the most common data standard currently used for Real Time Passenger Information systems in the UK. There are various types of SIRI data which deliver different real time information.
A mixture of Regional Transport Partnerships, local authorities and operators currently coordinate the real time information for bus operators in their areas and feed this to the Traveline Scotland open data hub. While coverage is increasing, real time information is not currently available for every bus service in Scotland. Quality issues can also arise when the scheduled data used in the Traveline Scotland journey planner differs from the schedule data being reported by real-time systems. This can occur when changes to bus services take place at short notice.
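The kind of mismatch described above can be sketched as a simple comparison. This is an illustrative Python example only: the function name `find_schedule_mismatches`, the journey and stop references, and the times are all hypothetical, and it does not describe how the Traveline Scotland systems actually reconcile their data.

```python
def find_schedule_mismatches(planner, realtime):
    """Return the sorted (journey, stop) keys whose scheduled time differs
    between the two sources, or which appear in only one of them."""
    keys = set(planner) | set(realtime)
    return sorted(k for k in keys if planner.get(k) != realtime.get(k))


# Scheduled times keyed by (journey reference, stop code); values are made up.
planner_schedule = {
    ("J1", "STOP_A"): "09:00",
    ("J1", "STOP_B"): "09:10",
}
realtime_schedule = {
    ("J1", "STOP_A"): "09:00",
    ("J1", "STOP_B"): "09:15",  # retimed at short notice
    ("J1", "STOP_C"): "09:20",  # stop added at short notice
}

mismatches = find_schedule_mismatches(planner_schedule, realtime_schedule)
print(mismatches)
```

Here the retimed stop and the added stop are both reported, while the unchanged stop is not.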
There are two options for the standard in which real time data could be provided, based on the current processes of those operators already providing real time data. These are:
SIRI: SIRI data (specifically SIRI-SM, SIRI-VM, SIRI-ET and SIRI-SX) could be used to provide real time arrivals, departures and timetables, vehicle positions, data on the accessibility and emissions of the bus, and real time service alerts.
GTFS: Alternatively, GTFS Realtime could be used, which contains trip updates, vehicle positions, and service alerts.
Having reviewed the two options, we understand that SIRI collects higher resolution data than GTFS, and allows for data to be converted into GTFS for use by data consumers. We believe that real time data should be provided in the SIRI data standard to collect more raw data, and to align with English and Welsh standards.
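As a rough illustration of converting SIRI data for GTFS consumers, the Python sketch below reads a small, hand-written SIRI-VM style message and produces plain dictionaries whose keys mirror the fields of a GTFS Realtime vehicle position. Several things here are assumptions: the sample message and its values are invented, the function name `siri_vm_to_vehicle_positions` is hypothetical, the output is a dictionary and not an encoded GTFS Realtime feed, and mapping `LineRef` directly to `route_id` is a simplification, since a real conversion would need to match identifiers against the published static data.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

SIRI_NS = {"siri": "http://www.siri.org.uk/siri"}

# Invented sample message for illustration only.
SAMPLE_SIRI_VM = """<Siri xmlns="http://www.siri.org.uk/siri" version="2.0">
  <ServiceDelivery>
    <VehicleMonitoringDelivery>
      <VehicleActivity>
        <RecordedAtTime>2024-01-15T09:30:00+00:00</RecordedAtTime>
        <MonitoredVehicleJourney>
          <LineRef>X1</LineRef>
          <VehicleLocation>
            <Longitude>-3.1883</Longitude>
            <Latitude>55.9533</Latitude>
          </VehicleLocation>
          <VehicleRef>BUS-001</VehicleRef>
        </MonitoredVehicleJourney>
      </VehicleActivity>
    </VehicleMonitoringDelivery>
  </ServiceDelivery>
</Siri>"""


def siri_vm_to_vehicle_positions(xml_text):
    """Convert SIRI-VM vehicle activities into GTFS Realtime style dictionaries."""
    root = ET.fromstring(xml_text)
    positions = []
    for activity in root.iterfind(".//siri:VehicleActivity", SIRI_NS):
        journey = activity.find("siri:MonitoredVehicleJourney", SIRI_NS)
        location = journey.find("siri:VehicleLocation", SIRI_NS)
        recorded = datetime.fromisoformat(
            activity.findtext("siri:RecordedAtTime", namespaces=SIRI_NS)
        )
        positions.append({
            # Assumption: LineRef is used as the route identifier as-is.
            "trip": {"route_id": journey.findtext("siri:LineRef", namespaces=SIRI_NS)},
            "vehicle": {"id": journey.findtext("siri:VehicleRef", namespaces=SIRI_NS)},
            "position": {
                "latitude": float(location.findtext("siri:Latitude", namespaces=SIRI_NS)),
                "longitude": float(location.findtext("siri:Longitude", namespaces=SIRI_NS)),
            },
            "timestamp": int(recorded.timestamp()),  # seconds since the Unix epoch
        })
    return positions


vehicle_positions = siri_vm_to_vehicle_positions(SAMPLE_SIRI_VM)
print(vehicle_positions)
```

The sketch covers vehicle positions only. Arrival predictions and service alerts would need their own mappings.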
Bus stop location and facilities data
Effective journey planning requires detailed information on the location and accessibility of bus stops. There is currently a single standard for this information, used by local authorities or a body appointed to undertake this function for them. This is the NaPTAN data format, which is required for submitting information to the NaPTAN database (currently NaPTAN 2.5).
NaPTAN is the UK National Public Transport Access Nodes dataset. It describes the precise location of stops, stations and ports for all public transport modes. NaPTAN is the foundation of most scheduling and journey planning systems. NaPTAN works alongside a second dataset, the National Public Transport Gazetteer (NPTG), which is a topographic database of all cities, towns and settlements in the UK, providing a frame of reference for the NaPTAN database.
Local authorities input and maintain this stop data for bus stops and stations. The DfT maintains other stop types centrally (Metro, ferry ports, rail stations etc). In Scotland, local authorities maintain this data; however, some authorities have arrangements with other bodies. For example, SPT manages this data for the authorities within its boundaries. Traveline Scotland directly manages the NaPTAN data for the Western Isles and the Shetland Islands.
Because there is no statutory requirement to keep NaPTAN data up to date, its upkeep relies on the resourcing and prioritisation of the work at authority level, and quality across Scotland is therefore variable.
It appears sensible to use the NaPTAN standard for bus stop data in Scotland. We therefore propose a requirement on local authorities to maintain the data relevant to bus services in a format which supports the NaPTAN and NPTG datasets. In England, where this information is already mandatory, the data must be submitted in XML format, may not be a zipped file, and must be a file size of less than 128MB; however, there are no restrictions on naming conventions.
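The three file constraints described for England can be expressed as a short pre-submission check. The Python sketch below is illustrative only: the function name `check_submission` and the sample file contents are hypothetical, "128MB" is assumed to mean 128 × 1024 × 1024 bytes, and the XML check only confirms that the file is well-formed, not that it is valid against the NaPTAN schema.

```python
import os
import tempfile
import xml.etree.ElementTree as ET
import zipfile

MAX_BYTES = 128 * 1024 * 1024  # assumption: "128MB" read as 128 MiB


def check_submission(path):
    """Return a list of problems; an empty list means the file passes these checks."""
    problems = []
    if os.path.getsize(path) >= MAX_BYTES:
        problems.append("file is not smaller than 128MB")
    if zipfile.is_zipfile(path):
        problems.append("file is a zip archive")
        return problems  # no point parsing an archive as XML
    try:
        ET.parse(path)  # well-formedness only, not schema validation
    except ET.ParseError:
        problems.append("file is not well-formed XML")
    return problems


# Usage with throwaway files; the XML content is a made-up placeholder.
with tempfile.TemporaryDirectory() as tmp:
    xml_path = os.path.join(tmp, "stops.xml")
    with open(xml_path, "w", encoding="utf-8") as f:
        f.write("<NaPTAN><StopPoints/></NaPTAN>")

    zip_path = os.path.join(tmp, "stops.zip")
    with zipfile.ZipFile(zip_path, "w") as zf:
        zf.write(xml_path, "stops.xml")

    xml_result = check_submission(xml_path)
    zip_result = check_submission(zip_path)

print(xml_result)
print(zip_result)
```

Because there are no restrictions on naming conventions, the check deliberately ignores the file name and extension and inspects only the file's size and contents.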