CONTACT US

If you have a query, please let us know, by sending an email to info@reflectionsit.in.
INDIA:+91 471 4010 301

OFFICE HOURS

Mon-Fri 9:00AM - 6:00PM
Sat & Sun by appointment only!

CREATE ACCOUNT

FORGOT YOUR DETAILS?

Reflect India

This solution aims to create a powerful decision support system with rich data visualizations. It does this by identifying relations between various data sources available in data.gov.in and specialized data marts are created so as to enable users to make informed decisions.

This solution is designed to inspire users to put on the thinking hat and provoke creativity by providing greater in-depth to the analysis that can be done over government data.

Technical Overview

The solution shall build an Operation Data Store, DW and Data Marts which pulls data from data.gov.in datasets. Data which is received as csv format will be Staged to a SQL Server Database, SQL Server Integration Services packages will be used for performing the ETL operations.
Data Quality Rules will be defined for domains using Data Quality Services. SSIS packages shall use the domain rules defined in data quality services to quality check and cleanse the data. Kimball Data Quality Screen process will be implemented to log Quality of data and data quality reports would be generated using Reporting services.

The solution shall provide a Master Data Management tool using Master Data Services which the Government authorized members can edit. For e.g. Regions within a country is managed by Government of India – in the recent case of Telengana becoming a state, an authorized person can use the MDM interface to create an entry in State Attribute of the MDM entity Region.

The ODS shall be modelled in our solution, i.e. Region shall be one entity of the ODS, The other Hierarchies in the Entity shall be defined, say country -> State-> District -> Corporation etc. When the data is received from different sources it shall go though ETL process which loads the data to these entities. The ODS shall be exposed as an Open Data through REST APIs which public can access (read only), and authorized people can write too.

Data warehouse shall be modelled following top-down approach. The data from the ODS gets moved to a Data Warehouse. Then Data Marts shall be created for each sections. For Example trade analysis shall be a Data mart that is provided for accessing trade data based on data from the Ministry of Commerce. Accident Analysis is another Data Mart that is build based on the data from Mistry of Road Transport and so forth.

The public can access the data using a Portal which will provide rich Data visualization categorized by subject area – Health, Road Accidents, Trade, Agriculture etc.

data-services

User scenarios

Government/NGO/public – The solution will enable public user to visualize and analyze open data using data marts designed for each subject areas

Government/Government Statistical Departments – Improve the quality of data by setting rules for data cleansing with the help of our technical team.

Media/ Internet Feeds – The solution will implement an operational data store with data retrieved from various datasets, A RESTFUL Web service will be built over this ODS that will provide data in a XML/JSON format that the third party could interpret and present is a different format, for e.g. Charts

Supported Devices

All devices supported by Power BI tools – Windows mobile primarily

SharePoint 2013 Website shall be accessible from mobile browsers

Technology stack

Data Loading from Sources / ETL– SQL Server Integration Services 2012

Data Cleanse and Quality control – Data Quality Services

Master Data Management – Master Data Services

Data Storage – SQL Server 2012

Data Marts – SSAS Tabular and Multidimensional

Portal – SharePoint 2013

Data Visualization – Power View / Reporting Services

Screen shots of the solution

Welcome screen

This is the login page. Users can log in to the system and based on the role – Citizen or Government appropriate modules to access will be shown

home

Best place to live analysis

This is one of the sample pre-defined analysis available in the system available for public as well as government use.

Best place to live analysis

Government user landing page

gov-user-after-sign-in

Upload Data

upload-data

Staging project

Reflections follow a standard architecture in ETL process from staging through the data-mart. We have a well-defined format for our SSIS packages. As shown below, data load packages ideally will have 4 steps –

1. Capture the audit log – what package is being loaded, the start time of the package

2. Delete existing data in the table (staging only)

3. Use data flow transformations to load new data

4. Complete the package by another audit task that capture – End time of the package and number of records loaded

figure1

Quality

The data in staging is not strongly types. Staging tables will have nvarchar fields so as to accommodate all incoming data during data load. This data will be loaded to the Quality DB by type casting the data. Errors during type casting or data insert to Quality DB will be pushed to the Bad DB. Correct data will be send to Quality DB for data quality and cleansing activities.

Quality

Cleanse

Data from the Quality DB is passed through a Data Quality Services Transformation to apply cleansing rules to identify the valid/invalid data and correct the invalid data if possible. Valid/Corrected data moves to Clean DB and the invalid data is pushed to Bad DB

figure3

DQS Transformation

In the DQS transformation, we have to map the incoming data fields to an already created domain in DQS client. The rules defined in the domain will then be applied to the data field.

DQS Transformation

DQS Client

In the data quality services client, we need to create a knowledge base and then a domain. The domain will have the set of rules defined. This will be published to a knowledge base to validate the data. Then with existing data we can perform a knowledge discovery to identify the new data in the database. Based on this data, the rules can be modified and republished.

DQS Client

DQS Domain Rules

This is where we can specify the valid and invalid values for a domain. We can add multiple rules for a single domain. Once the domain is created, it can be published.

DQS Domain Rules

DQS Domain Values

Once we have the domain rules created, we can identify the domain values and suggest corrections to data. Output of data validated using these rules will have – valid values, corrected values and the invalid data. The valid/corrected data will be sent to a Clean DB, then to datawarehouse and finally to specialized data marts for further analysis. The bad data will be pushed to Bad DB. A data quality report will be generated based on Bad DB statistics. This is explained further below.

DQS Domain Values

Master Data Services

We use Master Data Services as our master data management tool. Reflections will define the master entities. Authorized government personnel can add/update the master data. This will ensure that the same golden record is available for all the government data that this system process.

Master Data Service

MDS – Explorer

Authorized government users can add or edit data to in master tables.

MDS – Explorer

MDS – Manage users

Reflections can add users to the MDS system using the security settings of MDS. User added can be validated against the active directory. User groups can be added as well.

MDS – Manage users

MDS – Manage permissions

We can set the permissions for the users/groups in MDS at various levels. The security settings can be applied to a model, entity, or at an attribute level. This means a user can be granted access to entire database or to a single column in a table. The same applies in the case of access denial as well.

MDS – Manage permissions

The ETL Framework

ETL Framework

Data Quality Reports

These reports help understand the quality of data over time. This is help in bringing down data quality issue and hence raise the standards of data being fed to the system

dataqualityreports

Other specialized analysis – Accident Analysis

accidentanalysis

Other specialized analysis – Industrial GDP Analysis

industrial-gdp-analysis

CODE for HONOR Official Website

TOP