Applications of big data to smart cities
© Al Nuaimi et al. 2015
Received: 15 May 2015
Accepted: 5 November 2015
Published: 1 December 2015
Many governments are considering adopting the smart city concept in their cities and implementing big data applications that support smart city components to reach the required level of sustainability and improve the living standards. Smart cities utilize multiple technologies to improve the performance of health, transportation, energy, education, and water services leading to higher levels of comfort of their citizens. This involves reducing costs and resource consumption in addition to more effectively and actively engaging with their citizens. One of the recent technologies that has a huge potential to enhance smart city services is big data analytics. As digitization has become an integral part of everyday life, data collection has resulted in the accumulation of huge amounts of data that can be used in various beneficial application domains. Effective analysis and utilization of big data is a key factor for success in many business and service domains, including the smart city domain. This paper reviews the applications of big data to support smart cities. It discusses and compares different definitions of the smart city and big data and explores the opportunities, challenges and benefits of incorporating big data applications for smart cities. In addition it attempts to identify the requirements that support the implementation of big data applications for smart city services. The review reveals that several opportunities are available for utilizing big data in smart cities; however, there are still many issues and challenges to be addressed to achieve better utilization of this technology.
Current work and research projects in this field have generated literature that highlights the importance of big data in supporting smart city applications and services. In addition, some studies have investigated issues of utilizing big data in smart cities [3–6]. The main contribution of this paper is a review of the applications of big data in smart cities and an exploration of the opportunities and challenges of utilizing big data in this setting. In addition, the paper investigates the general requirements for the design and implementation of big data based applications and services for smart cities.
This paper will first, in Section 2, introduce the concepts of a smart city, big data, and applications of big data in a smart city. We will also investigate the current definitions of these concepts available in the literature and we will compare them. In Section 3 we will discuss the benefits and opportunities of smart cities, big data, and their applications and in Section 4 we will identify the challenges of using big data for smart city applications and services. We will then move to offering an overview of the general requirements to implement smart city applications based on big data in Section 5. In Section 6 we will discuss and illustrate some open issues that may help other researchers start their research in the field and in Section 7 we will conclude the paper.
Definitions of smart city and the differences and similarities between them
Definition of Smart City Concept
Area of Focus
“Smart city is a very broad concept, which includes not only physical infrastructure but also human and social factors” .
Included the social aspects and agreed that smart city has a broad focus.
“The concept of Smart City (SC) as a means to enhance the life quality of citizen has been gaining increasing importance in the agendas of policy makers. However, a shared definition of SC is not available and it is hard to identify common global trends” .
Policy makers are an additional aspect of the smart city definition. Acknowledges the lack of a shared definition of smart cities.
“Smart city, the important strategy of IBM, mainly focuses on applying the next-generation information technology to all walks of life, embedding sensors and equipment to hospitals, power grids, railways, bridges, tunnels, roads, buildings, water systems, dams, oil and gas pipelines and other objects in every corner of the world, and forming the “Internet of Things” via the Internet” .
Addresses the technological aspect of smart cities and focuses on how next-generation information technology is the key.
“A city well performing in a forward-looking way in economy, people, governance, mobility, environment, and living, built on the smart combination of endowments and activities of self-decisive, independent, and aware citizens” .
Views a smart city as a futuristic model of collaborative components.
“A city that monitors and integrates conditions of all of its critical infrastructures, including roads, bridges, tunnels, rails, subways, airports, seaports, communications, water, power, even major buildings, can better optimize its resources, plan its preventive maintenance activities, and monitor security aspects while maximizing services to its citizens” .
Focuses on the integration of infrastructure and systems that monitor and control the resources to achieve sustainability as the main aspect of a smart city.
“Connecting the physical infrastructure, the IT infrastructure, the social infrastructure, and the business infrastructure to leverage the collective intelligence of the city” .
A more generic view that puts together all main aspects of a smart city to achieve the goal. Seems to be the most comprehensive definition of a smart city.
“A city striving to make itself “smarter” (more efficient, sustainable, equitable, and livable)” .
A general definition that does not specify how a city will get smarter.
A smart city is “. . . a city which invests in ICT enhanced governance and participatory processes to define appropriate public service and transportation investments that can ensure sustainable socio-economic development, enhanced quality-of-life, and intelligent management of natural resources” .
Views the smart city as a specific, narrow set of resources and services working together to achieve a better life.
From the offered definitions we can view the smart city as an integrated living solution that links many aspects of life, such as power, transportation, and buildings, in a smart and efficient manner to improve the quality of life for the citizens of such a city. In addition, the definitions also focus on the future by emphasizing the importance of sustainability of resources and applications for future generations. We observed these aspects in each smart city proposal regardless of size, location and available resources. In general, governments around the world are mostly concerned about the cost of acquiring a smart city due to varying financial abilities and the scarcity of resources, natural or human. The availability and size of such resources and their capabilities is one of the challenges of building and maintaining a smart city. Another challenge is the regulatory systems that could greatly affect the chances of success. On top of that, there are also technical challenges requiring highly advanced technological solutions. Conversely, new and emerging technologies can help transform such challenges into opportunities.
Four definitions of big data
SAS: “Big data is a popular term used to describe the exponential growth, availability, and use of information, both structured and unstructured” .
IBM: “Data, coming from everywhere; sensors used to gather climate information, posts to social media sites, digital pictures and videos, purchase transaction record, and cell phone GPS signal to name a few” .
“Big Data is defined as large set of data that is very unstructured and disorganized” .
“Big data is a form of data that exceeds the processing capabilities of traditional database infrastructure or engines” .
Volume: refers to the size of data that has been created from all the sources.
Velocity: refers to the speed at which data is generated, stored, analyzed and processed. An emphasis is being put recently on supporting real-time big data analysis.
Variety: refers to the different types of data being generated. It is common now that most data is unstructured and cannot be easily categorized or tabulated.
Variability: refers to how the structure and meaning of data constantly change, especially when dealing with data generated from natural language analysis.
Value: refers to the possible advantage big data can offer a business based on good big data collection, management and analysis.
Others also mention a few more Vs of big data that cover some more aspects. For example volatility, which refers to the retention policy of the structured data implemented from different sources. Also there is validity that refers to the correctness, accuracy, and validation of the data. In addition there is veracity, which refers to the accuracy and truthfulness of the captured data and the meaningfulness of the results generated from the data for certain problems.
The various characteristics of big data demonstrate the huge potential for gains and advancements. The possibilities are endless; however, they are bounded by the available technologies and tools. For big data to achieve its goals and advance services in smart cities, it needs the right tools and methods to be analyzed and classified effectively and efficiently. By understanding the available capabilities and limitations, we can capture many opportunities for better services and applications for smart cities using big data.
3 Benefits and opportunities
Efficient resource utilization: With many resources becoming either scarce or very expensive, it is important to integrate solutions that give better and more controlled utilization of these resources. Technological systems such as enterprise resource planning (ERP) and geographic information systems (GIS) are a useful starting point. With monitoring systems at work, it will be easier to spot waste points and better distribute resources while controlling costs and reducing energy and natural resource consumption. In addition, one of the important aspects of smart city applications is that they are designed for interconnectivity and data collection, which can also facilitate better collaboration across applications and services.
Better quality of life: With better services, more efficient work and living models, and less waste (in time and resources), smart city citizens will have a better quality of life. This is the result of better planning of living/work spaces and locations, more efficient transportation systems, better and faster services, and the availability of enough information to make informed decisions.
Higher levels of transparency and openness: The need for better management and control of the different smart city aspects and applications will drive interoperability and openness to higher levels. Data and resource sharing will be the norm. In addition, this will increase information transparency for everyone involved. This will encourage collaboration and communication between entities and the creation of more services and applications that further enhance the smart city. One example is the US government, which collected and released a wide range of data, publications, and content in the name of transparency and openness. These offered citizens and government entities the chance to exchange and use the data effectively.
Achieving these benefits requires high levels of sophistication and involvement in terms of the applications, resources and people involved. The opportunities to achieve these benefits are available; however, they require investing in more technology, better development efforts and effective use of big data. There is also the need to set policies to ensure data accuracy, high quality, high security, privacy, and control of the data, as well as to use data documentation standards that provide guidance on the content and use of the datasets. In addition, technology can be very useful when considering the management and protection of environmental resources, infrastructures, and natural resources with the ultimate goal of increasing sustainability.
Big data applications have the potential to serve many sectors in a smart city. They help provide better customer experiences and services, which help businesses achieve better performance (e.g. higher profits or increased market share). They can improve healthcare by improving preventive care services, diagnosis and treatment tools, healthcare records management and patient care. Transportation systems can greatly benefit from big data to optimize routes and schedules, accommodate varying demands and be more environmentally friendly.
Deploying big data applications requires the support of a good information and communication technology (ICT) infrastructure. ICT supports smart cities because it provides useful solutions, some of which would not be possible without it. For example, it enables efficient transport planning by providing easy ways to handle services from different fields/locations to reduce transportation costs. Other examples include providing better water management and improved waste management by applying innovations to effectively manage these services. For example, waste management includes waste collection, disposal, recycling, and recovery, all of which can be efficiently managed using ICT solutions. More examples include new construction and structural methods for the health of buildings and a better environment; risk management; safety and security; air quality and pollution; public health; urban sprawl; bio-diversity loss; and energy efficiency. In general, a smart city can be made smarter when utilizing ICT and big data for many of its applications and services.
Smart education: ICT provides a solution to enhance the efficiency, effectiveness, and productivity of education processes using smart education services that are flexible and intelligent, providing better use of information, enhanced control and assessment, and greater support for life-long learning for all people (citizens and stakeholders). Smart education applications will engage people in active learning environments that allow them to adapt to the rapid changes of society and the environment. In addition, relying on big data collected in the field and correctly processed to generate the required information will have a positive effect on knowledge levels and on the teaching/learning tools used to deliver or acquire knowledge. Furthermore, technology can make such opportunities available everywhere, including remote or rural areas where commuting to schools may not be possible or where people's economic status is low and they cannot afford other, more expensive models. Using ICT and big data will also help create a knowledge-based society, which will enhance the nation’s competitiveness. Big data in education is generated mainly by collecting data on people (e.g. students, teachers, parents, administrators, and other support personnel), infrastructures (e.g. schools, libraries, computing facilities, educational locations, museums, universities, and other related entities), and information (e.g. courses, books, exams, grades, economic surveys, assessments, reports, and much more). This data can create a useful resource for analysis, for extracting useful trends and models, and for using them to offer better education. As an example, big data helps educational organizations to personalize learning and “create communities of practice and standardize the presentation of knowledge”. Big data in education can also be utilized to identify educational shortages in order to enhance study curricula.
Smart traffic lights: One of the main aspects of smart cities is good control of the traffic flow within the city, which will enhance the transportation systems and improve the citizens’ commutes and the city’s overall traffic patterns. As the population increases, traffic problems, pollution, and economic problems follow. For this reason, the use of smart traffic lights and signals is one of the most important techniques that smart cities use to deal with high volumes of traffic and congestion. Smart traffic lights and signals should be interconnected across the traffic grids to offer more information about traffic patterns. Each sensor detects a different parameter of the traffic flow (e.g. the speeds of cars, traffic density, waiting time at the lights, traffic jams, etc.). The system makes decisions according to the values of these parameters and gives the appropriate instructions to the lights and signals. Thus, the more data available to this system, the more informed the decisions it will be able to make. As a result, to offer the best possible services in smart traffic lights, it will be best to collect data from all traffic lights across the city and build intelligent decision systems using this data. This requires the use of real-time big data analytics. As an example, implementing smart traffic lights and signals designed by the Traffic21 project in Pittsburgh, Pennsylvania, USA produced significant results: traffic jams and waiting times were reduced, cutting emissions by over 20%.
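To make the decision step described above more concrete, the following is a minimal sketch of how a controller might turn sensor parameters into light timings. It is an illustration only and not the method used by Traffic21 or any deployed system: the `ApproachReading` schema, the demand score, and the cycle and minimum-green values are all assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass
class ApproachReading:
    """One sensor snapshot for a single approach to an intersection (hypothetical schema)."""
    name: str
    queue_length: int   # vehicles currently waiting at the light
    avg_wait_s: float   # average waiting time in seconds


def allocate_green_times(readings, cycle_s=90.0, min_green_s=10.0):
    """Split one signal cycle among approaches in proportion to an assumed demand score."""
    if not readings:
        return {}
    spare = cycle_s - min_green_s * len(readings)
    if spare < 0:
        raise ValueError("cycle too short for the minimum green times")
    # Assumed score: queue length, weighted up as the average wait grows.
    scores = {r.name: r.queue_length * (1.0 + r.avg_wait_s / 60.0) for r in readings}
    total = sum(scores.values())
    if total == 0:
        # No demand anywhere: share the cycle evenly.
        return {r.name: cycle_s / len(readings) for r in readings}
    # Every approach gets the minimum green; the rest is shared by demand.
    return {name: min_green_s + spare * score / total for name, score in scores.items()}


if __name__ == "__main__":
    snapshot = [
        ApproachReading("north", queue_length=12, avg_wait_s=45.0),
        ApproachReading("east", queue_length=3, avg_wait_s=20.0),
    ]
    for approach, green in allocate_green_times(snapshot).items():
        print(f"{approach}: {green:.1f} s of green")
```

A city-wide system would feed such a function with readings streamed from many intersections, which is where the real-time big data analytics requirement comes from.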
Smart grid: The smart grid is an important component of a smart city. It is a renovated electrical grid system that uses information and communication technology to collect and act on available data, such as information about the behaviors of suppliers and consumers, in an automated fashion to add value. It improves the efficiency, reliability, economics, and sustainability of the production and distribution of electric power. A smart grid uses computer-based remote controls with two-way communication technology between power producers and consumers to increase grid efficiency and reliability through system self-monitoring and feedback. This involves placing smart sensors and meters on production, transmission, and distribution systems, in addition to consumer access points, to get granular near real-time data about current power production, consumption, and faults. It implements dynamic pricing models for power usage to smooth out peaks by applying high charges during peak times and lower charges during other periods. This helps avoid potential power outages due to high consumer demand. It can provide consumers with near real-time information about their energy use and allow them to manage their usage based on both their needs and the prices they can afford. Consumer devices such as washing machines and water heaters can be made more cost-effective by controlling them automatically to operate during lower pricing periods. Although the smart grid has many potential benefits, it requires the collection of huge amounts of data from power producers, transmission systems, distributors, and consumers. In addition, it requires processing the collected data in real time, which is a big data analytics task, to send back control information that improves the overall performance of the electric power system.
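The dynamic pricing and automatic device control described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not a model of any real tariff: the two price levels, the 80% peak threshold, and the function names are hypothetical.

```python
def hourly_prices(forecast_demand_kw, base_price=0.10, peak_price=0.25, peak_fraction=0.8):
    """Charge peak_price in hours whose forecast demand reaches peak_fraction of the maximum."""
    threshold = peak_fraction * max(forecast_demand_kw)
    return [peak_price if demand >= threshold else base_price
            for demand in forecast_demand_kw]


def cheapest_start_hour(prices, run_hours):
    """Return the start hour that minimizes total price for a run of consecutive hours."""
    if run_hours > len(prices):
        raise ValueError("run is longer than the pricing horizon")
    best_start, best_cost = 0, float("inf")
    for start in range(len(prices) - run_hours + 1):
        cost = sum(prices[start:start + run_hours])
        if cost < best_cost:  # strict comparison keeps the earliest of equal-cost windows
            best_start, best_cost = start, cost
    return best_start


if __name__ == "__main__":
    # Hypothetical demand forecast (kW) for five consecutive hours.
    forecast = [100, 95, 50, 40, 90]
    prices = hourly_prices(forecast)
    start = cheapest_start_hour(prices, run_hours=2)
    print("prices per hour:", prices)
    print("run the washing machine starting at hour", start)
```

In practice the forecast itself would come from analytics over meter data, and the pricing rule would be set by the utility; the sketch only shows how a price signal lets a deferrable device shift its load away from peaks.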
Examples of Big Data Projects in Smart City Components
Smart city components
Big Data Projects
Transportation, Mobility, and Logistics
An accelerated-time simulation for traffic flow (ATISMART model) based on the use of smart traffic lights and signals as a part of a smart city project. Accelerated-time simulations for traffic flow should take into consideration three different factors: the city map, the cars, and the smart signals. To implement a smart traffic flow, there are some requirements to consider such as network sensors, traffic lights, and CAS as the mathematical core of the model and Java for the GUI .
“Ministry of Health and Welfare initiated the Social Welfare Integrated Management Network to analyze 385 different types of public data from 35 agencies and comprehensively manage welfare benefits and services provided by the central government, as well as by local governments, to deserving recipients” .
“The Ministry of Food, Agriculture, Forestry, and Fisheries and the Ministry of Public Administration and Security, or MOPAS, plan to launch the Preventing Foot and Mouth Disease Syndrome system, harnessing big data related to animal disease overseas, customs/immigration records, breeding farm surveys, livestock migration, and workers in the livestock industry” .
“In 2004, to address national security, infectious diseases, and other national concerns, the Singapore government launched the Risk Assessment and Horizon Scanning (RAHS) program within the National Security Coordination Centre. Collecting and analyzing large-scale data sets, it proactively manages national threats, including terrorist attacks, infectious diseases, and financial crisis. … A notable REC application is exploration of possible scenarios involving importation of avian influenza into Singapore and assessment of the threat of outbreaks occurring throughout southeast Asia” .
NEdNet (National Education Network) is an integrated system including network infrastructure services, education information services (EIS), and learning services, which facilitate higher-order thinking skills, support learner-centered self-directed and tailored learning, and decision support .
Natural resources & energy
The UK government established the Horizon Scanning Centre (HSC) in 2004 to improve the government’s ability to deal with cross-departmental and multi-disciplinary challenges. In 2011, the HSC’s Foresight International Dimensions of Climate Change effort addressed climate change and its effects on the availability of food and water, regional tensions, and international stability and security by performing in depth analysis on multiple data channels .
Government & agency administration
“To manage real-time analysis of high volume streaming data, develop a massively scalable, clustered infrastructure. … For discovery and visualization of information from thousands of real-time sources, encompassing application development and systems management built on Hadoop, stream computing, and data warehousing” .
“In 2009, the U.S. government launched data.gov as a step toward government transparency and accountability. It is a warehouse containing 420,894 datasets covering transportation, economy, health care, education, and human services and the data source” .
“In 2011, Syracuse, NY, in collaboration with IBM, launched a Smarter City project to use big data to help predict and prevent vacant residential properties. Michigan’s Department of Information Technology constructed a data warehouse to provide a single source of information” .
Benefits of Big Data in Smart City Components
Smart City Components
Benefits of Big Data in Smart City Components
• Allow healthcare providers and practitioners to gather, analyze, and utilize patient information, which can also be used by insurance companies and some government agencies.
• Support processing complex occurrences to monitor, analyze, and flag potential health issues either on a daily basis or on a demand basis.
• Increase the amount and real-time nature of data gathered for certain patients’ health issues through smart devices, which are connected to the home or hospital to monitor attributes like blood pressure, blood sugar, and sleep patterns for accurate and timely responses to health issues and for comprehensive patient history records.
• Facilitate decision-making related to the supply levels of electricity in line with the actual demand of the citizens and overall affecting conditions.
• Allow forecasting in a near-real time manner through efficient analysis of the big data collected.
• Align with strategic objectives (resource optimization) through specific pricing plans consistent with supplies, demand, and production models.
• Recognize traffic patterns by investigating real-time data.
• Reduce main city roads’ congestion by predicting traffic conditions and adjusting traffic controls. Through big data, the smart city will be able to reduce traffic and accidents by opening new roads, enhancing the infrastructure based on congestion data, and collecting information on car parking and alternative roads.
• Reduce supply chain waste by associating deliveries and optimizing shipping movements.
• Enable data streaming to process and communicate traffic information collected through sensors, smart traffic lights and on-vehicle devices to drivers via smartphones or other communication devices.
• Big data can be used to send feedback for specific entities to take action to alleviate or resolve a traffic problem.
• Provide weather information that will lead to improving the country’s agriculture, better informing people of possible hazardous conditions, and better management of energy utilization by providing more accurate predictions on demand.
• Provide detailed spatial and temporal geographic area maps and help to easily determine whatever changes may happen.
• Help predict future environmental changes or natural disasters like earthquake detection that will give an opportunity to save lives and resources.
• Optimize academic research; for instance, astronomers can now analyze a huge astronomy dataset using powerful computers instead of manual analyses. By analyzing and exploring high quality digital images taken from space, new discoveries may be made in the field. This is applicable to many science and research fields such as medical experiments, manufacturing operations, environmental studies, and economic and financial analysis.
• Behavior and matchmaking will lead to new knowledge. From assessment of graduates to online attitudes, each student generates a unique data track. By analyzing these data, education institutes can realize whether they are using their resources in the right places and producing the right results.
• Support the integration and collaboration of different government agencies and combine or streamline their processes. This will result in more efficient operations, better handling of shared data, and stronger regulation management and enforcement.
• Improve business decisions through big data analytics support. By researching a firm’s behavior and economic growth in addition to its rivals and environment conditions, more appropriate and effective decisions related to employment, production, and location strategies can be made.
• Publish new policies for the benefit of data owners (citizens) and producers (government agencies). Government agencies will help develop the quality of the data, while citizens will show how they can use the data and transfer it to new knowledge to enhance the quality of government services.
• Help governments focus on the citizens’ concerns related to health and social care, housing, education, policing, and other issues.
Data sources and characteristics: Data is generated from many different sources in many different formats. There are a lot of new data formats, many of which are unstructured (e.g. images, audio, tweets, video, server logs, etc.). This data needs to be managed and classified into a structured format using some form of advanced database system. Many authors have identified different Vs of big data; the most agreed upon are the 3 Vs: Velocity, Volume and Variety. Several more were added, such as Validity, Veracity, Volatility and Value, as well as Variability. Just trying to encompass these different attributes of big data generates very complex models and approaches and makes the data hard to manage. This is simply because current methodologies and data mining software tools cannot handle the large size and complexity. In addition, there are some challenges that may be faced in the future, such as analytics architecture, evaluation, distributed mining, time evolving data, compression, visualization, and hidden big data. When considering smart city applications utilizing big data, difficulties arise in various areas. For one, collecting the data is itself complicated by the existence of multiple sources with different formats and types and different usage and access policies. In addition, the unstructured nature of the data makes it hard to categorize and organize in a way that is easily accessible for applications to use.
Data and information sharing: Sharing data and information among different city departments is another challenge. Each government and city agency or department typically has its own warehouse or silo of confidential or public information. Most are reluctant to share what might be considered proprietary data. In addition, some data may be governed by certain privacy conditions that make them hard to share across different entities. The challenge here is to make sure not to cross the fine line between collecting and using big data and ensuring citizens’ rights of privacy. This is applicable within any smart city since there are many sectors and industries involved. Smart city applications will need to find ways to remove or reduce the barriers to seamless information sharing and exchange among different entities. Furthermore, with multiple diverse data sources distributed among related departments, some data types such as spatio-temporal data can be updated quickly. Therefore, it is difficult to create a unified understanding of data semantics and to extract new knowledge based on specific cycle data and real-time data. As a result, it will be difficult to create a knowledge base for a smart city.
Data quality: Looking at more fundamental aspects of big data, there are a number of challenges that are associated with the quality of the data. Data captured by different people under special regimes and stored in distinctive databases is rarely stored in any standard format. Relying on crowd sourcing and the collaboration of multiple providers will result in data that suffers from a lack of structure, and consequently consistency, heterogeneity, and disparity issues will have a greater chance of occurring. Accordingly, “there is no universal way to retrieve and transform the data automatically and universally into a unified data source for useful analysis”. That will cause more challenges like data uncertainty and trustworthiness. For example, sensor data collected through a third party without centralized control could have been produced by sensors that are faulty, wrongly calibrated, or beyond their lifetime. The challenge may also extend to the outputs of analysing existing data (given the possibility of errors) and reporting the results for use by others, who may not be aware of such issues. Therefore, continuously updating data gathering and usage policies, sharing and discussing them among all entities in a smart city, and ensuring that citizens understand and apply the policies correctly are vital and challenging at the same time.
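As an illustration of the sensor example above, the following sketch flags readings that may come from faulty, miscalibrated, or aged sensors before they enter an analysis. The record fields (`value`, `installed`), the plausible range, and the five-year lifetime are assumptions chosen for the example, not values taken from the literature reviewed here.

```python
from datetime import date


def quality_issues(reading, today, valid_range=(-40.0, 60.0), max_age_years=5):
    """Return a list of quality problems found in one sensor reading (a plain dict)."""
    issues = []
    low, high = valid_range
    value = reading.get("value")
    if value is None:
        issues.append("missing value")
    elif not low <= value <= high:
        # A value outside the plausible range suggests a fault or bad calibration.
        issues.append("value out of plausible range")
    installed = reading.get("installed")
    if installed is None:
        issues.append("unknown installation date")
    elif (today - installed).days > max_age_years * 365:
        issues.append("sensor beyond expected lifetime")
    return issues


if __name__ == "__main__":
    today = date(2015, 6, 1)
    readings = [
        {"sensor_id": "t-1", "value": 21.5, "installed": date(2014, 3, 1)},
        {"sensor_id": "t-2", "value": 140.0, "installed": date(2008, 1, 1)},
    ]
    for r in readings:
        print(r["sensor_id"], quality_issues(r, today) or "ok")
```

Simple checks like these do not establish that data is trustworthy, but they make the uncertainty explicit so that downstream users are aware of it.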
Security and privacy: Another major challenge in a smart city and in using big data is security and privacy. In basic terms this means that databases may include confidential information related to the government and people, so they need strong security policies and mechanisms to protect this data against unauthorized use and malicious attacks. In addition, smart applications integrated across agencies also require high security since the data will move over various types of networks, some of which may be open or unsecured. What makes the issue more complex is that most big data technologies today, including Cassandra and Hadoop, suffer from a lack of sufficient security. In addition to the need to secure data as it travels and as it is being used by the different components of smart city applications, there is also the need to clearly identify and protect the privacy rights of the organizations and individuals this data represents. Although specific smart city entities can claim ownership of most big data, a lot of it includes personal and private information about individuals. Health and medical records, financial and bank records, retail history, and much more all provide intimate views of the people they represent. Many view access to this type of data as a violation of a person’s legal right to privacy. Making sure that stringent privacy policies are put in place and properly enforced represents a major challenge for developers and users of big data smart city applications.
Cost: Cost is a sensitive subject that involves the ways public authorities may affect people when they use ICT solutions. For example, adopting an energy usage reduction system forces the government to use new systems, components or features to monitor consumption and record information. This leads to creating a smart energy management system; however, it is also very expensive to implement. In addition, if such a project is not implemented correctly from the beginning, it might cause a big problem, result in very high costs, and negatively affect the city. For example, the testing of a smart traffic light and signal system has a very high cost. These tests incur high costs not only in resources but also in the traffic problems caused while physically deploying and testing the system. Because of this, further development and monitoring of smart city infrastructure and applications will require replacing expensive hardware and software.
Smart City Population: People affect and are affected by the smart applications. In particular, the city’s population size has a great effect on the size of big data. As the population grows, the size of generated data also grows rapidly and can become massive. This is one of the main challenges because the rapid growth will generate traffic congestion, pollution, and increasing social inequality, besides increased urbanization, which raises a variety of technical, social, economic, and organizational problems that tend to jeopardize the economic and environmental sustainability of cities. As a result, smart city applications need to evolve quickly and extend efficiently to handle the growing volume and variety of big data to help avoid such problems. Ultimately, the goal is to develop and deploy smart city applications that are smart enough to evolve and intelligently handle the rapid growth of big data to generate better results.
As discussed above, there are several challenges facing smart city applications relying on big data. These challenges have varying effects and implications on such applications and pose varying levels of difficulty and complexity. Furthermore, different applications have different requirements for data usage. For example, traffic control requires immediate responses from the application to control traffic in real-time, while environmental sustainability applications may be able to handle more delayed responses as decisions are generally made over longer periods of time. Therefore, real-time transfer, discovery, analysis, decision-making, and response are an issue; however, the degree of their importance varies with the application. Moreover, achieving real-time responses depends heavily on how well the challenges discussed above are addressed.
This section covers the key components required to design and implement smart city applications utilizing ICT and big data components. Collecting and capturing data from sensors, users, electronic data readers and many other sources poses the first issue to handle as the volume rapidly grows. Storing, organizing and processing this data to generate useful results is the next issue. Fundamentally, effective solutions require selecting a number of design and development priorities in a planned manner, for example flexible design, quick deployment, more thorough sensing, more comprehensive interconnections, and more intelligence. To further complicate matters, handling interconnected communication infrastructures to access contextual information in smart city applications and physical spaces to support good decision making processes requires attention to various aspects of connectivity, security and privacy.
The applications of big data to smart cities can be classified into two types: offline big data applications and real-time big data applications. Real-time big data applications are different because they rely on instantaneous input and fast analysis to arrive at a decision or action within a short and very specific timeline. In many cases, if a decision cannot be made within that timeline, it becomes useless. It is therefore important that all data necessary for such a decision be available in a timely fashion and that the analysis be done in a fast and reliable way. As a result, real-time big data applications usually have higher technological requirements. Big data applications for smart city planning in areas like energy, traffic, education, and healthcare are considered offline. However, those needed to provide interactive actions, enhancements and controls for intelligent applications are real-time applications.
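The difference between the two classes can be expressed as a deadline attached to each decision. The sketch below is only an illustration of that idea: the function name `decide`, its parameters, and the example deadlines are assumptions, and timestamps are passed in explicitly (in seconds) so that the behaviour does not depend on the system clock.

```python
from typing import Optional


def decide(event_time: float, finished_time: float,
           deadline_s: float, proposed_action: str) -> Optional[str]:
    """Return the proposed action only if the analysis met its deadline."""
    latency = finished_time - event_time
    if latency > deadline_s:
        # A decision that arrives after its timeline is treated as useless.
        return None
    return proposed_action


# Real-time case: a traffic signal decision with a half-second budget.
on_time = decide(0.0, 0.4, 0.5, "extend_green")
too_late = decide(0.0, 2.0, 0.5, "extend_green")

# Offline case: a planning decision that tolerates a full day of latency.
planning = decide(0.0, 3600.0, 86400.0, "update_energy_plan")
```

Under this view an offline application is simply one whose deadline is long enough that batch processing can meet it.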
Big Data Management: The key advantage of smart city applications is that they generate large volumes of data in a variety of formats and from many sectors such as traffic, energy, education, healthcare, and manufacturing. This data is generated and collected in massive amounts and on a regular basis, thus offering a real-time view of what is happening in the city at any time. To ensure proper and useful utilization of this data in smart city applications, it is important to have suitable and effective big data management tools in place. Big data management includes the development and execution of architectures, policies, practices and procedures that properly manage the full data lifecycle throughout its use in smart city applications. As the data comes from different sources with different formats, there is a need for advanced data management features that recognize the different formats and sources of data and that structure, manage, classify, and control all these types and structures. Big data management for smart city applications should also provide scalable handling of massive data to support offline applications as well as low latency processing to serve real-time applications effectively. The concepts, techniques, and challenges of big data management are discussed further in [25, 26] and .
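Recognizing different formats and mapping them to a common structure can be sketched as a small dispatch table of parsers. The example below is a hedged illustration only: the two input layouts (a JSON energy meter message and a CSV traffic counter line), the field names, and the common schema are all hypothetical.

```python
import csv
import io
import json


def from_json(text: str) -> dict:
    """Parse a JSON meter message such as {"meter": "m7", "kwh": 1.2, "ts": 1700000000}."""
    raw = json.loads(text)
    return {"source": raw["meter"], "kind": "energy_kwh",
            "value": float(raw["kwh"]), "timestamp": int(raw["ts"])}


def from_csv(line: str) -> dict:
    """Parse a CSV traffic line laid out as: sensor id, timestamp, vehicle count."""
    sensor_id, ts, count = next(csv.reader(io.StringIO(line)))
    return {"source": sensor_id, "kind": "vehicle_count",
            "value": float(count), "timestamp": int(ts)}


# One parser per recognized format; new sources register a parser here.
PARSERS = {"json": from_json, "csv": from_csv}


def normalize(fmt: str, payload: str) -> dict:
    """Dispatch on the declared format and return a record in the common schema."""
    return PARSERS[fmt](payload)
```

For example, `normalize("csv", "t3,1700000060,42")` and `normalize("json", '{"meter": "m7", "kwh": 1.2, "ts": 1700000000}')` both return dictionaries with the same four keys, so later storage and analysis steps need not know where a record came from.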
Big Data Processing Platforms: Big data applications for smart cities need to perform data analytics that usually require huge processing capability. This leads to the need for scalable and reliable software and hardware platforms. The software platforms for smart cities should offer high performance computing capabilities, be optimized for the hardware being used, be stable and reliable for the different data-intensive applications being executed, support stream processing, provide a high level of fault resilience, and be supported by a well-trained and capable team and vendor. There are different available software platforms for big data analytics such as Hadoop MapReduce, HPCC, Stratosphere, and IBM InfoSphere Streams, the last of which provides the stream processing required by real-time big data applications such as intelligent transportation in a smart city. These platforms work well on cluster systems that can provide a powerful and scalable hardware platform to meet the requirements of big data applications for smart cities. Big data can also be processed on the Cloud using both big data Platform as a Service (PaaS) and Infrastructure as a Service (IaaS). This relieves the application owners of the burden of securing dedicated platforms, which is usually very costly, and allows them to use well-tested, highly reliable platforms offered by the Cloud service providers.
Smart network infrastructure: Most big data applications for smart cities require smart networks that connect their components, including residents’ equipment such as cars, smart house devices, and smart phones. This network should be capable of efficiently transferring collected data from its sources to where big data is collected, stored, and processed, and of transferring responses back to the different entities that need them in the smart city. Quality of service (QoS) support in the network is extremely important for real-time big data applications for smart cities. In these applications, all current distributed application events should be transferred in real-time to where they can be processed. These events can be transferred from their sources as raw events or as filtered or aggregated events. All generated raw, filtered, and aggregated events can be transferred to a centralized processing point or to distributed intermediate processing points in the smart network for pre-processing or for further filtering and aggregation before being transferred to the main decision making unit. The centralized approach is good if the volume of generated events is not huge and there are no limitations on the network resources used to transfer these events. The distributed approach is more suitable when the event volume is so large that it is inefficient, and sometimes impossible, to transfer all the generated events to a single location within acceptable performance and time bounds. Filtering and aggregation become important in this case, especially for smart cities, as they can help reduce the amount of generated network traffic and speed up data processing. This can be done at the event sources and the intermediate points using an open-loop or a closed-loop approach.
In the open-loop approach, filtering and aggregation policies are pre-defined, while in the closed-loop approach they are defined interactively based on the current events and decisions, current system and network resources, or external smart city application policies. In both approaches, event filtering and aggregation should be done without compromising the integrity, accuracy and correctness of the data being aggregated. This is important to preserve the quality of the decision making process in real-time big data applications.
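The contrast between the two approaches can be illustrated with a small sketch. It assumes, purely for illustration, that events are numeric readings, that the filtering policy is a single threshold, and that the feedback signal in the closed-loop case is a per-window forwarding budget standing in for current network resources. The function names and the adjustment rule are hypothetical and not taken from the cited work.

```python
from statistics import mean


def open_loop_filter(events, threshold):
    """Open-loop: the threshold is fixed in advance and never changes."""
    return [e for e in events if e >= threshold]


def closed_loop_filter(windows, threshold, budget):
    """Closed-loop: tighten the threshold whenever a window would forward
    more than `budget` events, so the policy follows the current load.

    `windows` is a list of event lists; returns the forwarded events per
    window together with the final threshold.
    """
    forwarded = []
    for window in windows:
        kept = [e for e in window if e >= threshold]
        if len(kept) > budget:
            threshold += 1  # feedback step: applies from the next window on
        forwarded.append(kept)
    return forwarded, threshold


def aggregate(window):
    """Summarize a non-empty window of events instead of sending each one."""
    return {"count": len(window), "mean": mean(window), "max": max(window)}
```

With three identical windows `[5, 7, 9, 11]`, an initial threshold of 6 and a budget of 2, the closed-loop filter forwards three events from each of the first two windows, raises the threshold twice, and forwards only `[9, 11]` from the third. The aggregate keeps the count and the maximum alongside the mean so that a downstream decision unit can still detect extreme events after the raw values are dropped.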
Advanced Algorithms: Standard algorithms used in regular applications may not be sufficient or efficient enough to handle big data applications due to their unique requirements and pressing need for high-volume, high-speed processing. For example, most available data mining algorithms are not very suitable for big data mining applications as their design is based on limited and well defined data sets. Big data applications for smart cities will need to implement advanced and more sophisticated algorithms to deal with big data efficiently. Some of these algorithms need to be designed for real-time application support while others can be designed for batch or offline processing. These algorithms need to be optimized to handle high data volumes, a large variety of data types, time constraints on decision making processes, and distributed components across various geographical locations. In addition, these algorithms need to work effectively across heterogeneous environments and be capable of managing and operating in highly dynamic environments.
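As one concrete illustration of an algorithm that does not assume a limited, fully available data set, the sketch below computes a running mean and variance in a single pass using Welford's method. It is offered only as an example of the streaming style of computation; the class name `RunningStats` is hypothetical, and the technique is a standard one rather than something proposed in the cited work.

```python
class RunningStats:
    """One-pass mean and variance (Welford's method) with constant memory."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x: float) -> None:
        """Fold one new observation into the statistics."""
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self) -> float:
        """Sample variance; returns 0.0 until two values have been seen."""
        return self._m2 / (self.n - 1) if self.n > 1 else 0.0


stats = RunningStats()
for reading in [2, 4, 4, 4, 5, 5, 7, 9]:
    stats.update(reading)
```

Because each update touches only three numbers, the same object can follow an unbounded stream of readings, which a method that first loads the whole data set cannot do.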
Open Standard Technology: As big data smart city applications involve large scale heterogeneous systems and data, it is advantageous to follow an open standard for designing and implementing such solutions. This will add flexibility for upgrading, maintaining, and adding more application features for smart cities. It will also facilitate the integration among smart city components and big data components. Furthermore, it is essential to set standard rules for new applications to achieve easy integration between the available smart city infrastructure and environment and the introduced big data applications. This can be achieved by performing a full study of the government entities, stakeholders, and the infrastructure to assess their readiness to be part of a future smart city. Based on such a study, regulations, standard design models and rules can be developed for big data application development for the smart city.
Security and Privacy: Given that most data collected and processed in smart city applications will contain some form of sensitive or private information, it is important to ensure that all technology and application components include and maintain acceptable levels of security and privacy mechanisms. Although a smart city provides many advantages for its residents, it also poses several threats to their safety, wellbeing and privacy by relying heavily on their data. The possibility of illegal access to or malicious attacks on such infrastructures can lead to catastrophic results affecting the city infrastructure, its government entities and its residents. Big data application designers and developers must include security and privacy policies and procedures as an integral part of the design and implementation of their applications. Clear guidelines and requirements must be identified from the various users to be enforced in the applications.
Citizen Awareness: Citizens must be aware of how to use ICT solutions for the smart city correctly and safely. Their active participation in providing information related to the different issues they may encounter with smart city applications will help in enhancing the quality of collected data and the performance of the applications. As a result, more effective decisions can be made from collected big data to enhance different smart city components. Another important aspect of citizen awareness is their knowledge and practice of good safety, security and privacy practices. Adequate training and awareness campaigns need to be conducted to make sure that people are aware and capable of protecting their own data and environment.
Government Role: Governing entities of smart cities must establish guiding principles of openness, transparency, participation, and collaboration to keep the exchange and flow of big data under control. Governments play an essential role in a smart city; therefore, advanced systems are required to manage big data collected and used by government entities. In addition, the government must review and recalibrate information and data policies as necessary by focusing on privacy, data reuse, data accuracy, data access, archiving, and preservation. It must also have well-defined data documentation and codebooks to ensure informed use of the datasets. To effectively support big data applications, a smart city government should balance the beneficial uses of data against individuals’ privacy concerns by addressing some of the fundamental concepts of privacy laws. This includes defining “personally identifiable information” and the role of individual control.
Along with these general non-functional requirements for big data applications, each application will also have its own set of functional and operational requirements. These requirements are gathered and analyzed when the application is being considered for development in the smart city. Together, the two sets of requirements should fully define all the necessary requirements and resources to successfully design, develop, test and deploy the required application. As the different requirements for big data smart city applications are gathered, it may also be helpful to use simulations to help predict and improve the outcomes of such applications. Simulation techniques offer a different, more realistic view of how the applications may behave and what the expected outcomes will be. This approach helps reduce a system’s cost in the implementation and testing phases and optimize the resources required for the project. Examples of such techniques are accelerated-time simulations of traffic flow (ATISMART model), whose Graphical User Interface (GUI) lets users interact easily with a dynamic and flexible system and which reduce the cost of implementing traffic lights and signals.
6 Discussion and open issues
Despite the prevalence of the smart city phenomenon worldwide, its definition remains obscure. The general perception currently is “I know it when I see it”, which implies some known characteristics that can be recognized in a smart city, yet they are still not well defined. Nevertheless, there seems to be agreement on what a smart city will achieve for its citizens and the environment. In general, a smart city will improve governance, enhance the economic standing of the city, improve the quality of life of its citizens, and help create environmentally friendly and sustainable infrastructures. This has led to highlighting several common characteristics, features and components that may specify the perspectives of a smart city. These include the intensive use of ICT and next generation information technology, the integration of the physical and social components of the city via the use of ICT, the implementation of advanced monitoring and control tools and applications to enhance efficiency and quality, and the improvement of infrastructures to support better quality of life and higher sustainability.
These aspects affect each smart city proposal regardless of its size. In general, governments around the world are also concerned about the costs and benefits of implementing a smart city. Many worry about the financial patterns, available resource levels, and their capabilities regarding regulation systems, as these pose challenges to tackle. Conversely, new technologies can help mitigate some of the challenges and offer more opportunities for success. In addition, there is a huge potential for using big data to address many of the issues involved in smart cities using analytics for deeper insights and better decision making practices. Furthermore, the cloud offers additional opportunities to implement and deploy ICT solutions for smart cities and support collaboration between different applications in a smart city. The vast advances in ICT, the Cloud, information technology, and big data offer cities more capabilities to be smarter than was possible just a short time ago.
Since big data is viewed as a strong enabler for smart city applications, we studied and compared its different definitions earlier. The various Vs of big data show how complex and difficult it is to collect, manage, store, and analyze big data. However, the sheer volume and variety of big data offer a great opportunity to create smart applications that respond effectively to current data and offer accurate tools for decision making. Including big data applications to support smart cities is not without challenges; however, successful implementations will propel a city far ahead in terms of how smart it is. With this visionary technology available, multiple countries around the world, such as South Korea, the US, and the UAE, are encouraged to build and support smart cities.
Understanding the characteristics of smart cities and acknowledging the need for advanced big data and ICT support facilitates the process of putting all these technologies together to start building smart city applications. Policy makers can now explore how to plan and construct smart cities. These can be viewed in three categories: “public infrastructure, construction of public platform for smart city, and construction of application systems”. Each one of these categories involves issues and challenges that can be considered future study fields. As plans progress and more research and development efforts are poured into smart city design, many of the issues and challenges will be addressed and solutions will be reached. As a result, more cities will start to become smarter and the overall quality of life will improve.
Set up the smart city’s direction by identifying its mission, vision and strategic and operational objectives.
Establish policies, principles, resources and expertise guidelines to control ICT and big data usage.
Build smart-ready public infrastructures and platforms including the ICT required to support smart city applications. This will involve evaluating and analyzing the current situations and the necessary changes and additions to reach the desired result.
Identify priorities and use them to determine the most important smart city components and applications that would offer the greatest effects with the smallest investment.
Integrate infrastructures, services and big data smart city applications to develop better and more efficient citizen experiences.
Optimize smart city services and operations using the collected data and the smart applications to enhance services and identify infrastructure and environmental improvements needs.
Realize new opportunities for further development by monitoring current developments and their effects and the arising issues and new requirements.
Developing simulation systems to help predict and view possible changes and forecast potential problems. This will help avoid or at least reduce some of the risks involved and in many cases also help reduce implementation and testing costs.
Benefitting from other smart city experiences to follow successful models and avoid problematic approaches.
Benefitting from experts and researchers to study available market systems (smart systems/services, data systems) and also research new possibilities for more advanced systems that suit the smart city and its objectives.
Investigating the correlation between big data and smart city applications. This understanding will help include the right data in the right applications to reach better decisions and optimize various functions in the smart city. Gartner, as an example, provides a simple diagram illustrating the value of such studies (see Fig. 3).
Is social media an important data source in smart cities, and what will communication between governments, citizens, and businesses look like? When everything is connected and integrated, should all entities, public and private, have access and rights to the same information and knowledge?
Security and privacy are another important issue to be carefully considered. When all systems are integrated, data will be shared among all entities in the smart city. Therefore, the infrastructure and platforms must be secured, privacy must be preserved and information must be fully protected.
The political considerations and effects on any city play a role in how well (or not) it will perform, and that also applies to smart cities. The privilege of access to information by different people in different power or political positions must be taken into consideration and addressed carefully.
The side effects of using technology are another issue to study. Since we will have a communication infrastructure that spans private and public networks, many of which may be wireless, we must consider all the possible risks and consequences of their use. In addition, many devices owned and operated by different people for various purposes and with many different levels of experience with ICT will be on board. It is generally unknown how this level of interaction with technology will affect the users and whether there will be negative effects on them. For example, many talk about the harmful effects of having cell phones nearby for extended periods of time; thus it is also logical to question the effects of all these technologies being included in smart city citizens’ lives.
The need for highly educated, well qualified people to design, develop, deploy and operate smart city infrastructures, platforms and applications is growing rapidly. Specialized education and training in these fields need to be developed and offered to create this type of workforce.
There is also the need to set common measurements and control policies for smart applications. Monitoring and control of initiatives and implementations using different tools and techniques is required in a smart city to ensure the correctness, effectiveness and quality of deployed smart city applications.
Smart city and big data are two modern and important concepts; therefore, many have started integrating them to develop smart city applications that will help reach sustainability, better resilience, effective governance, enhanced quality of life, and intelligent management of smart city resources. Our study explored both concepts and their different definitions, and we identified some common attributes for each. Despite the varying definitions, each concept has a number of characteristics that uniquely define it. Relying on these common characteristics, we were able to identify the general benefits of using big data to design and support smart city applications.
From there, we discussed the various opportunities available, which will result in building smart applications capable of utilizing all available data to enhance their operations and outcomes. We also discussed the various challenges in this domain and identified several issues that may hinder big data application development efforts. Based on that discussion, we suggested a list of general requirements for big data smart city applications. These requirements are necessary to design and implement effective and efficient applications. In addition, these requirements also try to address the challenges and propose different ways to resolve some of the issues and generate better results. Finally, we discussed some of the main open issues that need to be further investigated and addressed to reach a more comprehensive view of smart cities and develop them in a holistic, well thought out model.
Building and deploying successful big data smart city applications will require addressing the challenges and open issues, following rigorous design and development models, having well trained human resources, utilizing simulation models and being well prepared and well supported by the governing entities. With all success factors in place and a better understanding of the concepts, making a city smart will be possible, and further enhancing it for smarter models and services will be an attainable and sustainable goal.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
- Pantelis K, Aija L. Understanding the value of (big) data. In Big Data, 2013 IEEE International Conference on IEEE; 2013. pp. 38–42.
- Khan Z, Anjum A, Kiani SL. Cloud Based Big Data Analytics for Smart Future Cities. In Proceedings of the 2013 IEEE/ACM 6th International Conference on Utility and Cloud Computing. IEEE Computer Society; 2013. pp. 381–386.
- Kitchin R. The real-time city? Big data and smart urbanism. GeoJournal. 2014;79(1):1–14.
- Townsend AM. Smart cities: big data, civic hackers, and the quest for a new utopia. WW Norton & Company; 2013.
- Batty M. Big data, smart cities and city planning. Dialogues Hum Geog. 2013;3(3):274–9.
- Vilajosana I, Llosa J, Martinez B, Domingo-Prieto M, Angles A, Vilajosana X. Bootstrapping smart cities through a self-sustainable model based on big data flows. Commun Mag, IEEE. 2013;51(6):128–34.
- Michalik P, Stofa J, Zolotova I. Concept definition for Big Data architecture in the education system. In Applied Machine Intelligence and Informatics (SAMI), 2014 IEEE 12th International Symposium on; 2014. pp. 331–334.
- Fan W, Bifet A. Mining big data: current status, and forecast to the future. ACM SIGKDD Explor Newsl. 2013;14(2):1–5.
- Al-Hader M, Rodzi A. The smart city infrastructure development & monitoring. Theor Empir Res Urban Manage. 2009;4(2):87–94.
- Bertot JC, Choi H. Big data and e-government: issues, policies, and recommendations. In Proceedings of the 14th Annual International Conference on Digital Government Research. ACM; 2013. pp. 1–10.
- Kramers A, Höjer M, Lövehagen N, Wangel J. Smart sustainable cities–Exploring ICT solutions for reduced energy use in cities. Environ Model Software. 2014;56:52–62.
- Neirotti P, De Marco A, Cagliano AC, Mangano G, Scorrano F. Current trends in Smart City initiatives: Some stylised facts. Cities. 2014;38:25–36.
- Tantatsanawong P, Kawtrakul A, Lertwipatrakul W. Enabling future education with smart services. In SRII Global Conference (SRII), 2011 Annual IEEE; 2011. pp. 550–556.
- West DM. Big Data for Education: Data Mining, Data Analytics, and Web Dashboards. Governance Studies at Brookings. 2012. Available at http://www.brookings.edu/~/media/Research/Files/Papers/2012/9/04%20education%20technology%20west/04%20education%20technology%20west.pdf
- Marsh O, Maurov-Horvat L, Stevenson O. Big Data and Education: What’s the Big Idea? UCL Policy Briefing. 2014. Available at https://www.ucl.ac.uk/public-policy/public-policy-briefings/big_data_briefing_final.pdf
- Aguilera G, Galan JL, Campos JC, Rodríguez P. An Accelerated-Time Simulation for Traffic Flow in a Smart City. FEMTEC. 2013;2013:26.
- U.S. Department of Energy, “Smart Grid / Department of Energy,” Web: http://energy.gov/oe/technology-development/smart-grid, Retrieved Sep. 23, 2015.
- Yin J, Sharma P, Gorton I, Akyol B. Large-Scale Data Challenges in Future Power Grids. In Service Oriented System Engineering (SOSE), 2013 IEEE 7th International Symposium on IEEE; 2013. pp. 324–328.
- Mohamed N, Al-Jaroodi J. Real-time big data analytics: Applications and challenges. In High Performance Computing & Simulation (HPCS), 2014 International Conference on; 2014. pp. 305–310.
- Khan M, Uddin MF, Gupta N. Seven V’s of Big Data understanding Big Data to extract value. In American Society for Engineering Education (ASEE Zone 1), 2014 Zone 1 Conference of the IEEE; 2014. pp. 1–5.
- Su K, Li J, Fu H. Smart city and the applications. In Electronics, Communications and Control (ICECC), 2011 International Conference on IEEE; 2011. pp. 1028–1031.
- Lee CH, Birch D, Wu C, Silva D, Tsinalis O, Li Y, Guo Y. Building a generic platform for big sensor data application. In Big Data, 2013 IEEE International Conference on IEEE; 2013. pp. 94–102.
- Kim GH, Trimi S, Chung JH. Big-data applications in the government sector. Commun ACM. 2014;57(3):78–85.
- Chourabi H, Nam T, Walker S, Gil-Garcia JR, Mellouli S, Nahon K, Scholl HJ. Understanding smart cities: An integrative framework. In System Science (HICSS), 2012 45th Hawaii International Conference on IEEE; 2012. pp. 2289–2297.
- Xiaofeng M, Xiang C. Big data management: concepts, techniques and challenges [J]. J Comput Res Dev. 2013;1:98.
- Borkar V, Carey MJ, Li C. Inside Big Data management: ogres, onions, or parfaits? In Proceedings of the 15th International Conference on Extending Database Technology. ACM; 2012. pp. 3–14.
- Chaudhuri S. What next?: a half-dozen data management research goals for big data and the cloud. In Proceedings of the 31st symposium on Principles of Database Systems. ACM; 2012. pp. 1–4.
- Dittrich J, Quiané-Ruiz JA. Efficient big data processing in Hadoop MapReduce. Proc VLDB Endowment. 2012;5(12):2014–5.
- Middleton A, Solutions PDLR. Hpcc systems: Introduction to hpcc (high-performance computing cluster). White paper, LexisNexis Risk Solutions; 2011.
- Alexandrov A, Bergmann R, Ewen S, Freytag JC, Hueske F, Heise A, et al. The Stratosphere platform for big data analytics. VLDB J. 2014;23(6):939–64.
- Biem A, Bouillet E, Feng H, Ranganathan A, Riabov A, Verscheure O, Moran C. IBM InfoSphere Streams for scalable, real-time, intelligent transportation services. In Proceedings of the 2010 ACM SIGMOD International Conference on Management of Data. ACM; 2010. pp. 1093–1104.
- Ji C, Li Y, Qiu W, Awada U, Li K. Big data processing in cloud computing environments. In Pervasive Systems, Algorithms and Networks (ISPAN), 2012 12th International Symposium on IEEE; 2012. pp. 17–23.
- Wu X, Zhu X, Wu GQ, Ding W. Data mining with big data. IEEE Trans Knowl Data Eng. 2014;26(1):97–107.
- Tene O, Polonetsky J. Big data for all: Privacy and user control in the age of analytics. Nw J Tech Intell Prop. 2012;11:xxvii.
- Business analytics from basics to value, Gartner, Retrieved 4 May 15, Published on Jun 10, 2014, available at http://www.slideshare.net/sucesuminas/business-analytics-from-basics-to-value.