
Applications of big data to smart cities


Many governments are considering adopting the smart city concept in their cities and implementing big data applications that support smart city components to reach the required level of sustainability and improve the living standards. Smart cities utilize multiple technologies to improve the performance of health, transportation, energy, education, and water services leading to higher levels of comfort of their citizens. This involves reducing costs and resource consumption in addition to more effectively and actively engaging with their citizens. One of the recent technologies that has a huge potential to enhance smart city services is big data analytics. As digitization has become an integral part of everyday life, data collection has resulted in the accumulation of huge amounts of data that can be used in various beneficial application domains. Effective analysis and utilization of big data is a key factor for success in many business and service domains, including the smart city domain. This paper reviews the applications of big data to support smart cities. It discusses and compares different definitions of the smart city and big data and explores the opportunities, challenges and benefits of incorporating big data applications for smart cities. In addition it attempts to identify the requirements that support the implementation of big data applications for smart city services. The review reveals that several opportunities are available for utilizing big data in smart cities; however, there are still many issues and challenges to be addressed to achieve better utilization of this technology.

1 Introduction

Undoubtedly, the main strength of the big data concept is the high influence it will have on numerous aspects of a smart city and consequently on people’s lives [1]. Big data is growing rapidly, currently at a projected rate of 40 % growth in the amount of global data generated per year versus only 5 % growth in global IT spending. Around 90 % of the world’s digitized data was captured over just the past two years. As a result, many governments have started to utilize big data to support the development and sustainability of smart cities around the world. This allows cities to maintain the standards, principles, and requirements of smart city applications by realizing the main smart city characteristics. These characteristics include sustainability, resilience, governance, enhanced quality of life, and intelligent management of natural resources and city facilities. The smart city has well-defined components, such as mobility, governance, environment, and people, as well as applications and services such as healthcare, transportation, smart education, and energy [2]. Facilitating such applications and services requires large computational and storage facilities. One way to provide such platforms is to rely on cloud computing and utilize the many advantages of cloud services to support smart city big data management and applications. Figure 1 demonstrates how cloud computing can support big data collection, storage and analysis across cloud nodes and facilities.

Fig. 1 Using the Cloud to store data generated from different components of a smart city [2]

Current work and research projects in this field have generated some literature that highlights the importance of big data in supporting smart city applications and services. In addition, some work has investigated some of the issues of utilizing big data in smart cities [36]. The main contribution of this paper is a review of the applications of big data in smart cities and an exploration of the opportunities and challenges of utilizing big data in smart cities. In addition, the paper investigates the general requirements for the design and implementation of big data based applications for smart city services.

This paper will first, in Section 2, introduce the concepts of a smart city, big data, and applications of big data in a smart city. We will also investigate the current definitions of these concepts available in the literature and we will compare them. In Section 3 we will discuss the benefits and opportunities of smart cities, big data, and their applications and in Section 4 we will identify the challenges of using big data for smart city applications and services. We will then move to offering an overview of the general requirements to implement smart city applications based on big data in Section 5. In Section 6 we will discuss and illustrate some open issues that may help other researchers start their research in the field and in Section 7 we will conclude the paper.

2 Background

The smart city concept has different connotations from the people’s perspective versus the technological perspective. This is clear when countries set initiatives to build smart cities, because they express different points of view on what a smart city is. Although the smart city phenomenon is prevalent worldwide, its definition remains obscure. “The smart city sector is still in the ‘I know it when I see it’ phase, without a universally agreed definition”. In other words, a shared definition of a smart city is not yet offered, and it has been difficult to pinpoint a standard global meaning. However, the majority of definitions highlight common characteristics, features, and components that may specify the perspectives of smart cities. Examples include the enhancement of the quality of life for a particular segment (city citizens) through utilizing information technology hardware, software, networks, and data on different city areas and services. It could also involve various city components like natural resources, infrastructures, power, transportation, education, healthcare, government, and public safety. Table 1 depicts different definitions of a smart city that focus on some of these different areas.

Table 1 Definitions of smart city and the differences and similarities between them

From the offered definitions we can view the smart city as an integrated living solution that links many aspects of life, such as power, transportation, and buildings, in a smart and efficient manner to improve the quality of life for the citizens of such a city. In addition, the definitions also focus on the future by emphasizing the importance of sustainability of resources and applications for future generations. We observed these aspects in each smart city proposal regardless of size, location and available resources. In general, governments around the world are mostly concerned about the cost of acquiring a smart city due to their varying financial abilities and the scarcity of resources, natural or human. The availability and size of such resources and their capabilities is one of the challenges of building and maintaining a smart city. Another challenge is the regulatory systems that could greatly affect the chances of success. On top of all that, there are also technical challenges requiring highly advanced technological solutions. Conversely, new and emerging technologies can help transform such challenges into opportunities.

Data is being generated from multiple sources, resulting in the formation of what is currently known as big data. Data sources are everywhere around us: smartphones, computers, environmental sensors, cameras, GPS (Global Positioning System), and even people. Various applications like social media sites, digital pictures and videos, commercial transactions, advertising applications, games and many more helped accelerate data generation in the past few years [2, 7]. There are several big data definitions, see Table 2. Each offers a different view of the concept, yet together, we believe, they offer a full picture of it. Big data may be catalogued and stored at various sites and owned by different entities, and yet it mostly sits unused. Furthermore, there is a variety of potential uses of big data, both to address problems directly from the source and to gain deeper insights through data analytics, data intelligence and data mining. To meet this huge demand for resources to support big data analytics, the Cloud stepped in and offered an elegant and efficient solution. The Cloud is a suitable platform for highly resource intensive applications and for active collaboration between different applications. This fits very well with the requirements of smart city applications and could help resolve some of their challenges. Through these technological uses, smart cities have higher possibilities to be smarter than ever and achieve their goals more effectively and efficiently.

Table 2 Four definitions of big data

Figure 2 shows the employment of big data applications in smart cities. Smart city applications generate huge amounts of data, while big data systems utilize this data to provide information that enhances smart city applications. The big data systems will store, process, and mine smart city application data in an efficient manner to produce information that enhances different smart city services. In addition, big data will help decision-makers plan for any expansion in smart city services, resources, or areas.

Fig. 2 Smart city and big data relationship

In addition, there are some characteristics and features of big data that are called the Vs of big data management. According to [8] these include the main 3 Vs (1, 2 and 3) and two additional Vs:

  1. Volume: refers to the size of data that has been created from all the sources.

  2. Velocity: refers to the speed at which data is generated, stored, analyzed and processed. An emphasis is being put recently on supporting real-time big data analysis.

  3. Variety: refers to the different types of data being generated. It is common now that most data is unstructured and cannot be easily categorized or tabulated.

  4. Variability: refers to how the structure and meaning of data constantly changes especially when dealing with data generated from natural language analysis for example.

  5. Value: refers to the possible advantage big data can offer a business based on good big data collection, management and analysis.

Others also mention a few more Vs of big data that cover additional aspects. One is volatility, which refers to the retention policy applied to structured data collected from different sources. Another is validity, which refers to the correctness, accuracy, and validation of the data. A third is veracity, which refers to the accuracy and truthfulness of the captured data and the meaningfulness of the results generated from the data for certain problems.

The various characteristics of big data demonstrate the huge potential for gains and advancements. The possibilities are endless; however, they are bounded by the available technologies and tools. For big data to achieve its goals and advance services in smart cities, it needs the right tools and methods so that it can be analyzed and classified effectively and efficiently. By understanding the available capabilities and limitations, we can capture many opportunities for better services and applications for smart cities using big data.

3 Benefits and opportunities

Currently, many cities compete to be smart cities in hopes of reaping some of their economic, environmental and social benefits. As a result, many are eyeing the opportunities made possible by using big data analytics in smart city applications. Therefore, in this section we discuss some of the benefits and opportunities that may help in making the decision to convert or redesign a city to become a smart city. With such a decision, it may be possible to achieve enhanced levels of sustainability, resilience, and governance, in addition to improving the citizens’ quality of life and introducing intelligent management of infrastructures and natural resources [2]. Some of the benefits of having a smart city include the following:

  1. Efficient resource utilization: With many resources becoming either scarce or very expensive, it is important to integrate solutions that provide better and more controlled utilization of these resources. Starting with technological systems such as Enterprise Resource Planning (ERP) and Geographic Information Systems (GIS) [9] will be useful. With monitoring systems at work, it will be easier to spot waste points and better distribute resources while controlling costs and reducing the consumption of energy and natural resources. In addition, one of the important aspects of smart city applications is that they are designed for interconnectivity and data collection, which can also facilitate better collaboration across applications and services.

  2. Better quality of life: With better services, more efficient work and living models, and less waste (in time and resources), smart city citizens will have a better quality of life. This is the result of better planning of living/work spaces and locations, more efficient transportation systems, better and faster services, and the availability of enough information to make informed decisions.

  3. Higher levels of transparency and openness: The need for better management and control of the different smart city aspects and applications will drive interoperability and openness to higher levels. Data and resource sharing will be the norm. In addition, this will increase information transparency for everyone involved. This will encourage collaboration and communication between entities and the creation of more services and applications that further enhance the smart city. One example is the US government, which collected and released a wide range of data, publications, and content in the name of transparency and openness. These offered citizens and government entities the chance to exchange and use the data effectively.

Achieving these benefits requires high levels of sophistication and involvement in terms of the applications, resources and people involved. The opportunities to achieve these benefits are available; however, they require investing in more technology, better development efforts and effective use of big data. There is also the need to set policies to ensure data accuracy, high quality, high security, privacy, and control of the data, as well as to use data documentation standards that provide guidance on the content and use of the datasets [10]. In addition, technology can be very useful when considering the management and protection of environmental resources, infrastructures, and natural resources with the ultimate goal of increasing sustainability [11].

Big data applications have the potential to serve many sectors in a smart city [8]. They help provide better customer experiences and services, which helps businesses achieve better performance (e.g. higher profits or increased market shares). They can improve healthcare by improving preventive care services, diagnosis and treatment tools, healthcare records management and patient care. Transportation systems can greatly benefit from big data to optimize routes and schedules, accommodate varying demands and become more environmentally friendly.

Deploying big data applications requires the support of a good information and communication technology (ICT) infrastructure. ICT supports smart cities because it provides useful solutions, including unique solutions that may not be possible without it. For example, it enables efficient transport planning by providing easy ways to manage transport services across different fields/locations to reduce transportation costs [11]. Other examples include providing better water management and improved waste management by applying innovations to effectively manage these services. For example, waste management includes waste collection, disposal, recycling, and recovery [12], all of which can be efficiently managed using ICT solutions. More examples include new construction and structural methods for the health of buildings and a better environment; risk management; safety and security; air quality and pollution; public health; urban sprawl; bio-diversity loss; and energy efficiency. In general, a smart city can be made smarter when utilizing ICT and big data for many of its applications and services.

Adopting ICT, Cloud and big data solutions will help address many issues, such as providing the storage and analysis tools. In addition, this will help to reach the innovation stage [2] and encourage collaboration and communication between the different entities of a smart city. This can be done by building big data communities that work as one entity to foster collaborative and creative solutions addressing applications for areas like education, health, energy, law, manufacturing, environment, and safety. This also helps in real-time solutions to challenges in agriculture, transportation, and crowd management, as applications and systems are integrated and information flows easily across applications and entities [10]. There are many examples of big data applications serving smart cities, such as:

  1. Smart education [13]: ICT provides a solution to enhance the education processes’ efficiency, effectiveness, and productivity using smart education services that are flexible and intelligent, providing better use of information, enhanced control and assessment, and higher support for life-long learning for all people (citizens and stakeholders). Smart education applications will engage people in active learning environments that allow them to adapt to the rapid changes of society and the environment. In addition, relying on big data collected in the field and correctly processed to generate the required information will have a positive effect on knowledge levels and on the teaching/learning tools used to deliver or acquire knowledge. Furthermore, technology can make such opportunities available everywhere, including remote or rural areas where commuting to schools may not be possible, or where the economic status of people is low and they cannot afford other more expensive models. Using ICT and big data will also help create a knowledge-based society, which will enhance the nation’s competitiveness. Big data in education is generated mainly by collecting data on people (e.g. students, teachers, parents, administrators, and other support personnel), infrastructures (e.g. schools, libraries, computing facilities, educational locations, museums, universities, and other related entities), and information (e.g. courses, books, exams, grades, economic surveys, assessments, reports, and much more). This data can create a useful resource for analysis and for extracting useful trends and models that can be used to offer better and more enhanced education. As an example, big data supports educational organizations to personalize learning [14], “create communities of practice and standardize the presentation of knowledge” [15]. Big data in education can also be utilized to identify educational shortcomings in order to enhance curricula.

  2. Smart traffic lights [16]: One of the main aspects of smart cities is good control of the traffic flow within the city, which will enhance the transportation systems and improve the citizens’ commutes and the city’s overall traffic patterns. When the population increases, traffic problems, pollution, and economic problems follow. Due to this, the use of smart traffic lights and signals is one of the most important techniques that smart cities use to deal with high volumes of traffic and congestion. Smart traffic lights and signals should be interconnected across the traffic grids to offer more information about traffic patterns. Each sensor detects a different parameter of the traffic flow (e.g. the speeds of cars, traffic density, waiting time at the lights, traffic jams, etc.). The system makes decisions according to the values of these parameters and gives the appropriate instructions to the lights and signals. Thus, the more data available to this system, the more informed the decisions it will be able to make. As a result, to offer the best possible services in smart traffic lights, it will be best to collect data from all traffic lights across the city and build intelligent decision systems using this data. This requires the use of real-time big data analytics. As an example, implementing smart traffic lights and signals designed by the Traffic21 project in Pittsburgh, Pennsylvania, USA obtained significant results, reducing traffic jams and waiting times and cutting emissions by over 20 %.

  3. Smart grid: The smart grid is an important component of a smart city. It is a renovated electrical grid system that uses information and communication technology to collect and act on available data, such as information about the behaviors of suppliers and consumers, in an automated fashion to add value [17]. It improves the efficiency, reliability, economics, and sustainability of the production and distribution of electric power. A smart grid uses computer-based remote controls with two-way communication technology between power producers and consumers to increase grid efficiency and reliability through system self-monitoring and feedback. This involves placing smart sensors and meters on production, transmission, and distribution systems, in addition to consumer access points, to get granular near real-time data about the current power production, consumption, and faults. It implements dynamic pricing models for power usage to smooth out peaks by applying high charges during peak times and lower charges during other periods. This helps avoid potential power outages due to high consumer demands. It can provide consumers with near real-time information about their energy use and allow them to manage their usage based on both their needs and the prices they can afford. Consumer devices such as washing machines and water heaters can be made more cost-effective by controlling them automatically to operate during lower pricing periods. Although the smart grid has many potential benefits, it requires the collection of huge amounts of data from power producers, transmission systems, distributors, and consumers [18]. In addition, it requires processing the collected data, which is considered big data analytics, in real-time to send back control information that improves the overall performance of the electric power system [19].
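The smart grid item above mentions controlling consumer devices automatically so that they run during lower pricing periods. A minimal sketch of that idea follows, assuming a known day-ahead tariff with one price per hour. The function name `cheapest_start_hour` and the tariff values are hypothetical illustrations and are not taken from any of the cited systems.

```python
from typing import Sequence


def cheapest_start_hour(hourly_prices: Sequence[int], run_hours: int) -> int:
    """Return the start hour of the cheapest contiguous run of `run_hours`.

    Prices are integer cents per kWh, one entry per hour. The window does
    not wrap past the end of the list, and ties go to the earliest hour.
    """
    if not 0 < run_hours <= len(hourly_prices):
        raise ValueError("run_hours must be between 1 and the number of price slots")

    # Cost of the first window, then slide it one hour at a time.
    window_cost = sum(hourly_prices[:run_hours])
    best_start, best_cost = 0, window_cost
    for start in range(1, len(hourly_prices) - run_hours + 1):
        # Add the hour entering the window, drop the hour leaving it.
        window_cost += hourly_prices[start + run_hours - 1] - hourly_prices[start - 1]
        if window_cost < best_cost:
            best_start, best_cost = start, window_cost
    return best_start


# Hypothetical tariff: cheap at night, expensive in the evening peak.
tariff_cents = [10] * 6 + [20] * 11 + [35] * 4 + [20] * 3

# A washing machine cycle assumed to last two hours.
washer_start = cheapest_start_hour(tariff_cents, run_hours=2)
```

A real controller would also have to respect user constraints (for example, a latest finish time) and react to prices that change during the day; the sketch only shows the basic price-driven choice.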

We reviewed several examples of big data applications, which can be considered as guides to lead smart city applications development efforts. Many achieved various levels of success and most added valuable components to enhance smart city services and applications. Table 3 shows how cities around the world utilize applications of big data in different smart city components by implementing real smart city projects. Reviewing some of the actual implementations revealed that there are benefits of big data that reflect on smart city components. Table 4 summarizes these benefits within the different application domains used in smart cities.

Table 3 Examples of Big Data Projects in Smart City Components
Table 4 Benefits of Big Data in Smart City Components

4 Challenges

Many challenges face the design, development and deployment of big data applications for smart cities. Smart cities are considered very dynamic and evolving environments, thus it is important to avoid or at least reduce the challenges involved in smart applications design and development for smart cities. There are also some controversies related to the definition, use and benefits of big data for smart cities. These relate to available big data tools, real-time analytics, accuracy, representation, cost, and accessibility. Such issues can affect the performance of smart city applications and services relying on big data [8]. Is it possible that data is one of the challenges? How? Here we will address some of the key challenges in using big data in smart cities.

  • Data sources and characteristics: Data is generated from many different sources in many different formats. There are a lot of new data formats, many of which are unstructured (e.g. images, audio, tweets, video, server logs, etc.). This data needs to be managed and classified into a structured format using some form of advanced database system [7]. Many authors have identified different Vs of big data; the most agreed upon are the 3 Vs: Velocity, Volume and Variety. Several more were added, such as Validity, Veracity, Volatility and Value [20] as well as Variability [2]. Just trying to encompass these different attributes of big data generates very complex models and approaches and makes it hard to manage. This is simply because the current methodologies or data mining software tools cannot handle the large size and complexity. In addition, there are some challenges that may be faced in the future, such as analytics architecture, evaluation, distributed mining, time evolving data, compression, visualization, and hidden big data [8]. When considering smart city applications utilizing big data, difficulties arise in various areas. For one, collecting the data is itself complicated by the existence of multiple sources with different formats and types and different usage and access policies. In addition, the unstructured nature of the data makes it hard to categorize and organize in a way that is easily accessible for applications to use.

  • Data and information sharing: Sharing data and information among different city departments is another challenge. Each government and city agency or department typically has its own warehouse or silo of confidential or public information, and most are reluctant to share what might be considered proprietary data. In addition, some data may be governed by certain privacy conditions that make it hard to share across different entities. The challenge here is to make sure not to cross the fine line between collecting and using big data and ensuring citizens’ rights of privacy [20]. This is applicable within any smart city since there are many sectors and industries involved. Smart city applications will need to find ways to remove or reduce the barriers to seamless information sharing and exchange among different entities [21]. Furthermore, with multiple diverse data sources distributed among related departments, some data types such as spatio-temporal data can be updated quickly [21]. Therefore, it is difficult to create a unified understanding of data semantics and to extract new knowledge based on specific cycle data and real-time data. As a result, it will be difficult to create a knowledge base for a smart city.

  • Data quality: Looking at more fundamental aspects of big data, there are a number of challenges that are associated with the quality of the data. Data captured by different people under special regimes and stored in distinctive databases is rarely stored in any standard format [22]. Relying on crowd sourcing and collaboration of multiple providers will result in data that suffers from a lack of structure, and consequently consistency, heterogeneity, and disparity issues will have a greater chance to occur. Accordingly, “there is no universal way to retrieve and transform the data automatically and universally into a unified data source for useful analysis” [22]. That will cause more challenges like data uncertainty and trustworthiness. For example, sensor data collected through a third party without centralized control could have been produced by sensors that are faulty, wrongly calibrated, or beyond their lifetime. The challenge may also extend to the outputs of analysing existing data (given the possibility of errors) and reporting the results for use by others, who may not be aware of such issues. Therefore, continuously updating data gathering and usage policies, sharing and discussing them among all entities in a smart city, and ensuring that citizens understand and apply the policies correctly are vital and challenging at the same time [10].

  • Security and privacy: Another major challenge in a smart city and in using big data is security and privacy. In basic terms, this means that databases may include confidential information related to the government and people, so they need high levels of security policies and mechanisms to protect this data against unauthorized use and malicious attacks. In addition, smart applications integrated across agencies also require high security since the data will move over various types of networks, some of which may be open or unsecured [20]. What makes such an issue more complex is that most big data technologies today, including Cassandra and Hadoop, suffer from a lack of sufficient security [23]. In addition to the need to secure data as it travels and as it is being used by the different components of smart city applications, there is also the need to clearly identify and protect the privacy rights of the organizations and individuals this data represents. Although specific smart city entities can claim ownership of most big data, a lot of it includes personal and private information about individuals. Health and medical records, financial and bank records, retail history, and much more all provide intimate views of the people they represent. Many view access to this type of data as a violation of a person’s legal right to privacy. Making sure that stringent privacy policies are put in place and properly enforced represents a major challenge for developers and users of big data smart city applications.

  • Cost: Cost is a sensitive subject that involves the ways public authorities may affect people when they use ICT solutions. For example, an energy usage reduction system [11] forces the government to use new systems, components or features to monitor consumption and record information. This leads to creating a smart energy management system; however, it is also very expensive to implement [16]. In addition, if such a project is not implemented correctly from the beginning, it might cause big problems, result in very high costs, and negatively affect the city. For example, the testing of a smart traffic light and signal system has a very high cost. These tests incur high costs not only in resources but also in traffic problems while the system is physically deployed and tested [16]. Furthermore, the continued development and monitoring of smart city infrastructure and applications will require replacing expensive hardware and software [11].

  • Smart city population: People affect and are affected by smart applications [11]. In particular, the city’s population size has a great effect on the size of big data. As the population grows, the size of generated data also grows rapidly and can become massive. This is one of the main challenges because the rapid growth will generate traffic congestion, pollution, and increasing social inequality [12], besides increased urbanization, which raises a variety of technical, social, economic, and organizational problems that tend to jeopardize the economic and environmental sustainability of cities [12]. As a result, smart city applications need to evolve quickly and extend efficiently to handle the growing volume and variety of big data to help avoid such problems. Ultimately, the goal is to develop and deploy smart city applications that are smart enough to evolve and intelligently handle the rapid growth of big data to generate better results.
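The data quality challenge listed above mentions readings produced by sensors that are faulty, wrongly calibrated, or beyond their lifetime. As an illustration only, the following sketch flags such readings with three simple plausibility checks before they reach any analysis. The `Reading` record, the function `flag_suspect`, and all thresholds are assumptions made for this example; they do not describe a method from the cited literature.

```python
from dataclasses import dataclass
from statistics import median
from typing import List, Sequence


@dataclass
class Reading:
    sensor_id: str
    value: float          # e.g. temperature in degrees Celsius
    sensor_age_days: int  # time since the sensor was installed


def flag_suspect(
    readings: Sequence[Reading],
    low: float,
    high: float,
    max_age_days: int,
    max_deviation: float,
) -> List[str]:
    """Return the ids of sensors whose reading fails any plausibility check."""
    if not readings:
        return []
    # The median of all readings serves as a robust "typical" value.
    typical = median(r.value for r in readings)
    suspect = []
    for r in readings:
        out_of_range = not (low <= r.value <= high)            # physically implausible
        too_old = r.sensor_age_days > max_age_days             # beyond assumed lifetime
        far_from_peers = abs(r.value - typical) > max_deviation  # possible miscalibration
        if out_of_range or too_old or far_from_peers:
            suspect.append(r.sensor_id)
    return suspect


# Hypothetical temperature readings from five street-level sensors.
sample = [
    Reading("a", 21.0, 100),
    Reading("b", 22.0, 4000),  # older than the assumed 10-year lifetime
    Reading("c", 95.0, 50),    # outside the plausible range
    Reading("d", 20.5, 10),
    Reading("e", 35.0, 10),    # far from what neighbouring sensors report
]
suspects = flag_suspect(sample, low=-40.0, high=60.0, max_age_days=3650, max_deviation=10.0)
```

Checks like these only reduce the uncertainty discussed above; they cannot establish that the remaining readings are trustworthy, and the peer comparison assumes the sensors measure the same quantity under similar conditions.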

As discussed above, there are several challenges facing smart city applications that rely on big data. These challenges have varying effects and implications on such applications and pose varying levels of difficulty and complexity. Furthermore, different applications have different requirements for data usage. For example, traffic control requires immediate responses from the application to control traffic in real-time, while environmental sustainability applications may be able to handle more delayed responses as decisions are generally made over longer periods of time. Therefore, real-time transfer, discovery, analysis, decision-making, and response are an issue; however, the degree of their importance varies with the application [19]. Moreover, achieving real-time responses depends heavily on how well we address the challenges discussed above.

5 Requirements

This section covers the key components required to design and implement smart city applications utilizing ICT and big data components. Collecting and capturing data from sensors, users, electronic data readers, and many other sources poses the first issue to handle as the volume rapidly grows. Storing, organizing, and processing this data to generate useful results is the next issue. Fundamentally, effective solutions require selecting a number of design and development priorities in a planned manner, for example flexible design, quick deployment, more thorough sensing, more comprehensive interconnections, and more intelligence [24]. To further complicate matters, handling interconnected communication infrastructures to access contextual information in smart city applications and physical spaces, in support of good decision-making processes, requires attention to various aspects of connectivity, security, and privacy [2].

The applications of big data to smart cities can be classified into two types: offline big data applications and real-time big data applications. Real-time big data applications are different because they rely on instantaneous input and fast analysis to arrive at a decision or action within a short and very specific timeline [19]. In many cases, if a decision cannot be made within that timeline, it becomes useless. It is therefore important that all data necessary for such a decision is available in a timely fashion and that the analysis is done in a fast and reliable way. Consequently, real-time big data applications usually have higher technological requirements. Big data applications for smart city planning in areas like energy, traffic, education, and healthcare are considered offline. However, those needed to provide interactive actions, enhancements, and controls for intelligent applications are real-time applications [19].
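The difference between the two types can be illustrated with a minimal sketch, assuming a hypothetical `Decision` record and a per-application deadline; the names, the 0.5 s deadline, and the timestamps are illustrative assumptions and do not come from [19]. An offline application accepts a result whenever it arrives, while a real-time application discards a result that misses its deadline.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Decision:
    """Result of analysing one input (hypothetical, for illustration only)."""
    action: str
    produced_at: float  # seconds, on the same clock as the request time


def accept_decision(decision: Decision, requested_at: float,
                    deadline_s: Optional[float]) -> bool:
    """Return True if the decision is still usable.

    deadline_s is None for offline (planning) applications, which tolerate
    any delay. For real-time applications a late decision is useless and
    is rejected.
    """
    if deadline_s is None:
        return True
    return (decision.produced_at - requested_at) <= deadline_s


# Assumed example: a traffic-signal change requested at t = 10.0 s
# must be decided within 0.5 s.
on_time = Decision("extend_green", produced_at=10.3)
late = Decision("extend_green", produced_at=11.2)
```

With these values, `accept_decision(late, 10.0, 0.5)` rejects the late result, whereas the same result passed with `deadline_s=None` (an offline planning application) is accepted.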

When considering smart city applications based on big data, it is necessary to address several requirements that stem from the special nature of smart city needs and big data characteristics. In this section we attempt to discuss several of these requirements to provide a general guideline for the design and development efforts. These requirements are identified based on the type of big data applications and the challenges of implementing these applications for smart cities. Some of these requirements are technological while others are related to citizens’ awareness and governments’ roles. Furthermore, some of these requirements are general and apply to any big data application, while others are specific to the special needs of smart city environments.

  • Big Data Management: The key advantage of smart city applications is that they generate large volumes of data in a variety of formats and from many sectors such as traffic, energy, education, healthcare, and manufacturing. This data is generated and collected in massive amounts and on a regular basis, thus offering a real-time view of what is happening in the city at any time. To ensure proper and useful utilization of this data in smart city applications, it is important to have suitable and effective big data management tools in place. Big data management includes the development and execution of architectures, policies, practices, and procedures that properly manage the full data lifecycle throughout its use in smart city applications. As the data comes from different sources in different formats, there is a need for advanced data management features that recognize the different formats and sources of data and that structure, manage, classify, and control all these types and structures. Big data management for smart city applications should also provide scalable handling of massive data to support offline applications, as well as low-latency processing to serve real-time applications effectively. The concepts, techniques, and challenges of big data management are discussed further in [25, 26] and [27].

  • Big Data Processing Platforms: Big data applications for smart cities need to perform data analytics that usually require huge processing capability. This leads to the need for scalable and reliable software and hardware platforms. The software platforms for smart cities should offer high performance computing capabilities, be optimized for the hardware being used, be stable and reliable for the different data-intensive applications being executed, support stream processing, provide a high level of fault resilience, and be supported by a well-trained and capable team and vendor. Several software platforms are available for big data analytics, such as Hadoop MapReduce [28], HPCC [29], Stratosphere [30], and IBM InfoSphere Streams [31], the last of which provides the stream processing required by real-time big data applications such as intelligent transportation in a smart city [19]. These platforms work well on cluster systems that can provide a powerful and scalable hardware platform to meet the requirements of big data applications for smart cities. Big data can also be processed on the Cloud using both big data Platform as a Service (PaaS) and Infrastructure as a Service (IaaS) [32]. This relieves application owners of the burden of securing dedicated platforms, which is usually very costly, and allows them to use well-tested, highly reliable platforms offered by Cloud service providers.

  • Smart network infrastructure: Most big data applications for smart cities require smart networks connecting their components, including residents’ equipment such as cars, smart house devices, and smart phones. This network should be capable of efficiently transferring collected data from its sources to where big data is collected, stored, and processed, and of transferring responses back to the different entities that need them in the smart city. Quality of service (QoS) support in the network is extremely important for real-time big data applications for smart cities. In these applications, all current distributed application events should be transferred in real time to where they can be processed. These events can be transferred from their sources as raw events or as filtered or aggregated events. All generated raw, filtered, and aggregated events can be transferred to a centralized processing point or to distributed intermediate processing points in the smart network for pre-processing or for further filtering and aggregation before being transferred to the main decision-making unit. The centralized approach is suitable if the volume of generated events is not huge and there are no limitations on the network resources used to transfer them. The distributed approach is more suitable when the volume of events is so large that it is inefficient, and sometimes impossible, to transfer all generated events to a single location within acceptable performance and time bounds. Filtering and aggregation become important in this case, especially for smart cities, as they help reduce the amount of generated network traffic and speed up data processing. This can be done at the event sources and the intermediate points using an open-loop or a closed-loop approach. In the open-loop approach, filtering and aggregation policies are pre-defined, while in the closed-loop approach they are defined interactively based on the current events and decisions, current system and network resources, or external smart city application policies. In both approaches, event filtering and aggregation should be done without compromising the integrity, accuracy, and correctness of the data being aggregated. This is important to preserve the quality of the decision-making process in real-time big data applications [19].

  • Advanced Algorithms: Standard algorithms used in regular applications may not be sufficient or efficient enough for big data applications, given their unique requirements and pressing need for high-volume, high-speed processing. For example, most available data mining algorithms are not well suited to big data mining applications, as their design is based on limited and well-defined data sets [33]. Big data applications for smart cities will need to implement advanced and more sophisticated algorithms to deal with big data efficiently. Some of these algorithms need to be designed for real-time application support, while others can be designed for batch or offline processing. These algorithms need to be optimized to handle high data volumes, a large variety of data types, time constraints on decision-making processes, and components distributed across various geographical locations. In addition, these algorithms need to work effectively across heterogeneous environments and be capable of managing and operating in highly dynamic environments.

  • Open Standard Technology: As big data smart city applications involve large-scale heterogeneous systems and data, it is advantageous to follow an open standard for designing and implementing such solutions. This adds flexibility for upgrading, maintaining, and adding more application features for smart cities, and facilitates integration among smart city components and big data components. It is also essential to set standard rules for new applications to achieve easy integration between the existing smart city infrastructure and environment and the newly introduced big data applications. This can be achieved by performing a full study of the government entities, stakeholders, and infrastructure to assess their readiness to be part of a future smart city [10]. Based on such a study, regulations, standard design models, and rules can be developed for big data application development for the smart city.

  • Security and Privacy: Given that most data collected and processed in smart city applications will contain some form of sensitive or private information, it is important to ensure that all technology and application components include and maintain acceptable levels of security and privacy mechanisms. Although a smart city provides many advantages for its residents, it also poses several threats to their safety, wellbeing, and privacy by relying heavily on their data. Illegal access to, or malicious attacks on, such infrastructures can lead to catastrophic results affecting the city infrastructure, its government entities, and its residents. Big data application designers and developers must include security and privacy policies and procedures as an integral part of the design and implementation of their applications. Clear guidelines and requirements must be gathered from the various users and enforced in the applications.

  • Citizen Awareness: Citizens must be aware of how to use ICT solutions for smart cities correctly and safely. Their active participation in providing information about the issues they encounter with smart city applications will help enhance the quality of collected data and the performance of the applications. As a result, more effective decisions can be made from collected big data to enhance different smart city components. Another important aspect of citizen awareness is their knowledge and practice of good safety, security, and privacy practices. Adequate training and awareness campaigns need to be conducted to make sure that people are aware of, and capable of protecting, their own data and environment.

  • Government Role: Governing entities of smart cities must establish guiding principles of openness, transparency, participation, and collaboration to keep the exchange and flow of big data under control [10]. Governments play an essential role in a smart city; therefore, it is required to have advanced systems to manage big data collected and used by government entities. In addition, the government must review and recalibrate information and data policies as necessary by focusing on privacy, data reuse, data accuracy, data access, archiving, and preservation [10]. Therefore, it must have well-defined data documentation and codebooks to ensure informed use of the datasets [10]. To effectively support big data applications, smart city government should balance the beneficial uses of data against individuals’ privacy concerns by addressing some of the fundamental concepts of privacy laws. This includes defining “personally identifiable information”, and the role of individual control [34].
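The open-loop and closed-loop filtering and aggregation policies described under the smart network infrastructure requirement above can be illustrated with a minimal sketch. It assumes that events are numeric sensor readings and that the only feedback signal is the current link utilisation; the function names, window sizes, and utilisation thresholds are hypothetical choices for illustration and are not taken from [19].

```python
from statistics import mean
from typing import Iterable, List


def aggregate(events: Iterable[float], window: int) -> List[float]:
    """Aggregate raw events into per-window means (fewer values to transmit)."""
    batch: List[float] = []
    out: List[float] = []
    for value in events:
        batch.append(value)
        if len(batch) == window:
            out.append(mean(batch))
            batch = []
    if batch:  # flush the final partial window so no event is silently dropped
        out.append(mean(batch))
    return out


def open_loop_window() -> int:
    """Open-loop: the aggregation policy is fixed in advance."""
    return 4


def closed_loop_window(link_utilisation: float) -> int:
    """Closed-loop: the policy reacts to current network load (0.0 to 1.0).

    The thresholds below are assumed values, chosen only for illustration.
    """
    if link_utilisation > 0.8:
        return 8  # congested link: aggregate more aggressively
    if link_utilisation < 0.3:
        return 1  # lightly loaded link: forward raw events unchanged
    return 4


readings = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
fixed = aggregate(readings, open_loop_window())
congested = aggregate(readings, closed_loop_window(0.9))
idle = aggregate(readings, closed_loop_window(0.1))
```

In this sketch the open-loop policy always sends two window means for the eight readings, while the closed-loop policy sends a single mean when the link is congested and all eight raw readings when it is idle. A mean is only one possible aggregate; an application that needs peaks or counts would have to choose an aggregate that preserves them, in line with the integrity requirement stated above.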

Along with these general non-functional requirements for big data applications, each application will also have its own set of functional and operational requirements. These requirements are gathered and analyzed when the application is being considered for development in the smart city. Together, the two sets of requirements should fully define all the requirements and resources needed to successfully design, develop, test, and deploy the application. As the different requirements for big data smart city applications are gathered, it may also be helpful to use simulations to predict and improve the outcomes of such applications. Simulation techniques offer a different, more realistic view of how the applications may behave and what the expected outcomes will be. This approach helps reduce a system’s cost in the implementation and testing phases and optimize the resources required for the project. One example is the accelerated-time simulation of traffic flow (the ATISMART model), whose Graphical User Interface (GUI) lets users interact with it easily and makes the system dynamic and flexible, while reducing the cost of implementing traffic lights and signals [16].

6 Discussion and open issues

Despite the prevalence of the smart city phenomenon worldwide, its definition remains obscure. The general perception currently is “I know it when I see it”, which implies that a smart city has recognizable characteristics that are nevertheless still not well defined. There does, however, seem to be agreement on what a smart city will achieve for its citizens and the environment. In general, a smart city will improve governance, enhance the economic standing of the city, improve the quality of life of its citizens, and help create environmentally friendly and sustainable infrastructures. This has led to highlighting several common characteristics, features, and components that may specify the perspectives of a smart city. These include the intensive use of ICT and next-generation information technology, the integration of the physical and social components of the city via ICT, the implementation of advanced monitoring and control tools and applications to enhance efficiency and quality, and the improvement of infrastructures to support better quality of life and higher sustainability.

These aspects affect each smart city proposal regardless of its size. In general, governments around the world are also concerned about the costs and benefits of implementing a smart city. Many worry about financial patterns, available resource levels, and their capabilities regarding regulation systems, as these pose challenges to tackle. Conversely, new technologies can help mitigate some of the challenges and offer more opportunities for success. In addition, there is huge potential for using big data to address many of the issues involved in smart cities, using analytics for deeper insights and better decision-making practices. Furthermore, the cloud offers additional opportunities to implement and deploy ICT solutions for smart cities and to support collaboration between different applications in a smart city. The vast advances in ICT, the Cloud, information technology, and big data offer cities more capabilities to be smarter than was possible just a short time ago.

Since big data is viewed as a strong enabler for smart city applications, we studied and compared its different definitions earlier. The various Vs of big data show how complex and difficult it is to collect, manage, store, and analyze big data. However, the sheer volume and variety of big data offer a great opportunity to create smart applications that respond effectively to current data and offer accurate tools for decision making. Including big data applications to support smart cities is not without challenges; however, successful implementations will propel a city far ahead in terms of how smart it is. With this visionary technology available, multiple countries around the world, such as South Korea, the US, and the UAE, are encouraged to build and support smart cities.

Understanding the characteristics of smart cities and acknowledging the need for advanced big data and ICT support facilitates the process of putting all these technologies together to start building smart city applications. Policy makers can now explore how to plan and construct smart cities. This work can be viewed in three categories: “public infrastructure, construction of public platform for smart city, and construction of application systems” [21]. Each of these categories involves issues and challenges that can be considered fields for future study. As plans progress and more research and development efforts are poured into smart city design, many of the issues and challenges will be addressed and solutions will be reached. As a result, more cities will start to become smarter and the overall quality of life will improve.

In addition, it is essential to have clear, reliable strategic plans for smart cities that go beyond piecemeal initiatives or stand-alone projects. Such plans must take the various smart city requirements (physical, social, and technological) into account and avoid treating each part as its own silo. A holistic approach will give a better view of what is needed and will lead to more rounded, better designed, complete solutions for smart cities rather than islands of independent components and applications that can hardly recognize or connect with each other. Therefore, efforts should concentrate on creating a roadmap for success that covers several stages:

  1. Set up the smart city’s direction by identifying its mission, vision and strategic and operational objectives.

  2. Establish policies, principles, resources and expertise guidelines to control ICT and big data usage.

  3. Build smart-ready public infrastructures and platforms including the ICT required to support smart city applications. This will involve evaluating and analyzing the current situations and the necessary changes and additions to reach the desired result.

  4. Identify priorities and use them to determine the most important smart city components and applications that would offer the greatest effects with the smallest investment.

  5. Integrate infrastructures, services and big data smart city applications to develop better and more efficient citizen experiences.

  6. Optimize smart city services and operations using the collected data and the smart applications to enhance services and identify infrastructure and environmental improvement needs.

  7. Realize new opportunities for further development by monitoring current developments and their effects and the arising issues and new requirements.

Clearly, the use of ICT and information technology, including big data, will provide numerous opportunities to build smart city applications that effectively and efficiently cater for the needs of the various entities living in and using the city. Therefore, it is necessary to allocate enough resources and funding to support application development efforts throughout the various stages of smart city development. This investment is essential to reap the full benefits of smart cities and realize all the envisioned features and capabilities. To help optimize the work and minimize the costs of such projects, it is recommended to include some of the following activities in the process:

  1. Developing simulation systems to help predict and view possible changes and forecast potential problems. This will help avoid or at least reduce some of the risks involved and in many cases also help reduce implementation and testing costs.

  2. Benefitting from other smart city experiences to follow successful models and avoid problematic approaches.

  3. Benefitting from experts and researchers to study available market systems (smart systems/services, data systems) and also research new possibilities for more advanced systems that suit the smart city and its objectives.

  4. Investigating the correlation between big data and smart city applications. This understanding will help include the right data into the right applications to reach better decisions and optimize various functions in the smart city. Gartner, as an example, provides a simple diagram illustrating the values of such studies (see Fig. 3) [35].

    Fig. 3 Gartner’s studies of how data enables accurate decision-making for a smarter city [35]

As we end this discussion, we can affirm how vital big data is for smart city applications. We have shown several examples of using big data and the benefits of doing so. However, to use big data effectively for smart city applications, some open issues need to be addressed and resolved. Several of these open issues stem from the challenges we discussed earlier, while some may relate to other aspects we did not consider. Many of these open issues are currently under scrutiny and investigation by industry and research communities, yet no full solutions have been offered and there is always room for improvement and innovation in this field. Some of these open issues include, but are not limited to, the following:

  1. Is social media an important data source in smart cities, and what will communication look like between governments, citizens, and businesses? When everything is connected and integrated, should all entities, public and private, have access and rights to the same information and knowledge?

  2. Security and privacy are another important issue to be carefully considered. When all systems are integrated, data will be shared among all entities in the smart city. Therefore, the infrastructure and platforms must be secured, privacy must be preserved, and information must be fully protected.

  3. The political considerations and effects on any city play a role in how well (or not) it will perform, and that also applies to smart cities. The privilege of access to information by different people in different power or political positions must be taken into consideration and addressed carefully.

  4. The side effects of using technology are another issue to study. Since we will have a communication infrastructure that spans private and public networks, many of which may be wireless, we must consider all the possible risks and consequences of their use. In addition, many devices owned and operated by different people, for various purposes and with many different levels of experience with ICT, will be on board. It is generally unknown how this level of interaction with technology will affect users and whether there will be negative effects on them. For example, many talk about the harmful effects of having cell phones nearby for extended periods of time; thus it is also logical to question the effects of all these technologies being included in smart city citizens’ lives.

  5. The need for highly educated, well qualified people to design, develop, deploy, and operate smart city infrastructures, platforms, and applications is growing rapidly. Specialized education and training in these fields need to be developed and offered to create this type of workforce.

  6. There is also a need to set common measurements and control policies for smart applications. Monitoring and control of initiatives and implementations using different tools and techniques is required in a smart city to ensure the correctness, effectiveness, and quality of deployed smart city applications.

7 Conclusion

Smart city and big data are two modern and important concepts; therefore, many have started integrating them to develop smart city applications that will help achieve sustainability, better resilience, effective governance, enhanced quality of life, and intelligent management of smart city resources. Our study explored both concepts and their different definitions, and we identified some common attributes for each. Despite the varying definitions, each concept has a number of characteristics that uniquely define it. Relying on these common characteristics, we were able to identify the general benefits of using big data to design and support smart city applications.

From there, we discussed the various opportunities available, which will result in building smart applications capable of utilizing all available data to enhance their operations and outcomes. We also discussed the various challenges in this domain and identified several issues that may hinder big data application development efforts. Based on that discussion, we suggested a list of general requirements for big data smart city applications. These requirements are necessary to design and implement effective and efficient applications. In addition, they try to address the challenges and propose different ways to resolve some of the issues and generate better results. Finally, we discussed some of the main open issues that need to be further investigated and addressed to reach a more comprehensive view of smart cities and develop them in a holistic, well thought out model.

Building and deploying successful big data smart city applications will require addressing the challenges and open issues, following rigorous design and development models, having well trained human resources, utilizing simulation models, and being well prepared and well supported by the governing entities. With all success factors in place and a better understanding of the concepts, making a city smart will be possible, and further enhancing it for smarter models and services will be an attainable and sustainable goal.


  1. Pantelis K, Aija L. Understanding the value of (big) data. In Big Data, 2013 IEEE International Conference on IEEE; 2013. pp. 38–42.

  2. Khan Z, Anjum A, Kiani SL. Cloud Based Big Data Analytics for Smart Future Cities. In Proceedings of the 2013 IEEE/ACM 6th International Conference on Utility and Cloud Computing. IEEE Computer Society; 2013. pp. 381–386.

  3. Kitchin R. The real-time city? Big data and smart urbanism. GeoJournal. 2014;79(1):1–14.

  4. Townsend AM. Smart cities: big data, civic hackers, and the quest for a new utopia. WW Norton & Company; 2013.

  5. Batty M. Big data, smart cities and city planning. Dialogues Hum Geog. 2013;3(3):274–9.

  6. Vilajosana I, Llosa J, Martinez B, Domingo-Prieto M, Angles A, Vilajosana X. Bootstrapping smart cities through a self-sustainable model based on big data flows. Commun Mag, IEEE. 2013;51(6):128–34.

  7. Michalik P, Stofa J, Zolotova I. Concept definition for Big Data architecture in the education system. In Applied Machine Intelligence and Informatics (SAMI), 2014 IEEE 12th International Symposium on 2014. pp. 331–334.

  8. Fan W, Bifet A. Mining big data: current status, and forecast to the future. ACM SIGKDD Explor Newsl. 2013;14(2):1–5.

  9. Al-Hader M, Rodzi A. The smart city infrastructure development & monitoring. Theor Empir Res Urban Manage. 2009;4(2):87–94.

  10. Bertot JC, Choi H. Big data and e-government: issues, policies, and recommendations. In Proceedings of the 14th Annual International Conference on Digital Government Research. ACM; 2013. pp. 1–10.

  11. Kramers A, Höjer M, Lövehagen N, Wangel J. Smart sustainable cities–Exploring ICT solutions for reduced energy use in cities. Environ Model Software. 2014;56:52–62.

  12. Neirotti P, De Marco A, Cagliano AC, Mangano G, Scorrano F. Current trends in Smart City initiatives: Some stylised facts. Cities. 2014;38:25–36.

  13. Tantatsanawong P, Kawtrakul A, Lertwipatrakul W. Enabling future education with smart services. In SRII Global Conference (SRII), 2011 Annual IEEE; 2011. pp. 550–556.

  14. West DM. Big Data for Education: Data Mining, Data Analytics, and Web Dashboards. Governance Studies at Brookings. 2012. Available at

  15. Marsh O, Maurov-Horvat L, Stevenson O. Big Data and Education: What’s the Big Idea?. UCL Policy Briefing. 2014. Available at

  16. Aguilera G, Galan JL, Campos JC, Rodríguez P. An Accelerated-Time Simulation for Traffic Flow in a Smart City. FEMTEC. 2013;2013:26.

  17. U.S. Department of Energy, “Smart Grid / Department of Energy,” Web:, Retrieved Sep. 23, 2015.

  18. Yin J, Sharma P, Gorton I, Akyoli, B. Large-Scale Data Challenges in Future Power Grids. In Service Oriented System Engineering (SOSE), 2013 IEEE 7th International Symposium on IEEE; 2013. pp. 324–328.

  19. Mohamed N, Al-Jaroodi J. Real-time big data analytics: Applications and challenges. In High Performance Computing & Simulation (HPCS), 2014 International Conference on; 2014. pp. 305–310.

  20. Khan M, Uddin MF, Gupta N. Seven V’s of Big Data understanding Big Data to extract value. In American Society for Engineering Education (ASEE Zone 1), 2014 Zone 1 Conference of the IEEE; 2014. pp. 1–5.

  21. Su K, Li J, Fu H. Smart city and the applications. In Electronics, Communications and Control (ICECC), 2011 International Conference on IEEE; 2011. pp. 1028–1031.

  22. Lee CH, Birch D, Wu C, Silva D, Tsinalis O, Li Y, Guo Y. Building a generic platform for big sensor data application. In Big Data, 2013 IEEE International Conference on IEEE; 2013. pp. 94–102.

  23. Kim GH, Trimi S, Chung JH. Big-data applications in the government sector. Commun ACM. 2014;57(3):78–85.

  24. Chourabi H, Nam T, Walker S, Gil-Garcia JR, Mellouli S, Nahon K, Scholl HJ. Understanding smart cities: An integrative framework. In System Science (HICSS), 2012 45th Hawaii International Conference on IEEE; 2012. pp. 2289–2297.

  25. Xiaofeng M, Xiang C. Big data management: concepts, techniques and challenges [J]. J Comput Res Dev. 2013;1:98.

  26. Borkar V, Carey MJ, Li C. Inside Big Data management: ogres, onions, or parfaits?. In Proceedings of the 15th International Conference on Extending Database Technology. ACM; 2012. pp. 3–14.

  27. Chaudhuri S. What next?: a half-dozen data management research goals for big data and the cloud. In Proceedings of the 31st symposium on Principles of Database Systems. ACM; 2012. pp. 1–4.

  28. Dittrich J, Quiané-Ruiz JA. Efficient big data processing in Hadoop MapReduce. Proc VLDB Endowment. 2012;5(12):2014–5.

  29. Middleton A, Solutions PDLR. Hpcc systems: Introduction to hpcc (high-performance computing cluster). White paper, LexisNexis Risk Solutions; 2011.

  30. Alexandrov A, Bergmann R, Ewen S, Freytag JC, Hueske F, Heise A, et al. The Stratosphere platform for big data analytics. VLDB J. 2014;23(6):939–64.

  31. Biem A, Bouillet E, Feng H, Ranganathan A, Riabov A, Verscheure O, Moran C. IBM InfoSphere Streams for scalable, real-time, intelligent transportation services. In Proceedings of the 2010 ACM SIGMOD International Conference on Management of data ACM; 2010. pp. 1093–1104.

  32. Ji C, Li Y, Qiu W, Awada U, Li K. Big data processing in cloud computing environments. In Pervasive Systems, Algorithms and Networks (ISPAN), 2012 12th International Symposium on IEEE; 2012. pp. 17–23.

  33. Wu X, Zhu X, Wu GQ, Ding W. Data mining with big data. IEEE Trans Knowl Data Eng. 2014;26(1):97–107.

  34. Tene O, Polonetsky J. Big data for all: Privacy and user control in the age of analytics. Nw J Tech Intell Prop. 2012;11:xxvii.

  35. Business analytics from basics to value, Gartner, Retrieved 4 May 15, Published on Jun 10, 2014, available at

Author information

Corresponding author

Correspondence to Nader Mohamed.

Additional information

Competing interests

The authors declare that they have no competing interests.

Authors’ contributions

All authors read and approved the final manuscript.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

About this article

Cite this article

Al Nuaimi, E., Al Neyadi, H., Mohamed, N. et al. Applications of big data to smart cities. J Internet Serv Appl 6, 25 (2015).
