Organizations that use visual data discovery are more likely to find the information they need when they need it, and to do so more productively than other companies.
The amount of data being created by people, machines, IoT devices, and other sources continues to grow by leaps and bounds. On average, the world produces roughly 2.5 exabytes of data per day, and by the year 2020 we are expected to have generated a cool 44 zettabytes of raw data. While there’s no doubt that organizations are collecting and generating more information than ever before, simply having a lot of data does not make a business data driven.
Organizations need to effectively leverage their information to make decisions and drive new initiatives. Since data sources exist in silos, it’s often impossible to get a complete picture of an organization’s data assets, and a lack of governance means data must be cleansed and standardized before it can be of any use. Hence data must be gathered, organized, made interpretable, and then analyzed and acted upon to provide any meaningful value.
Enter data visualization. Data visualization is all about representing these numbers in a compelling format. Its goal is to use a variety of forms (points, lines, shapes, digits, and letters) to simplify data, promote understanding, and communicate important concepts and ideas. Data visualization abstracts data into a schematic form so that the human mind can easily process chunks of repetitive information and gain a clear understanding without much effort.
A common misconception is that data visualization and reporting are one and the same. They are not. Data visualization emphasizes easy interpretation and lets decision makers see connections between multi-dimensional data sets. It provides new ways to interpret data through heat maps, fever charts, and other rich graphical representations, rather than relying on static tables and charts that fail to paint a clear picture.
Apart from providing a clear representation of data, data visualization offers much more. Here are some of the benefits of implementing data visualization tools in your business:
The human brain is wired in such a way that visualizing is proven to be the single easiest way for us to receive and interpret large amounts of information intuitively, without deep technical expertise. Apart from gaining insight into existing data, advanced data visualizations can enable even a novice to forecast future trends. A survey carried out by Aberdeen Group discovered that managers who use data visualization tools are 28% more likely to find timely information than those who rely solely on managed reporting and dashboards.
The foremost advantage of presenting data visually is that it enables users to discern connections between operating conditions and business performance more effectively. Discovering these correlations can directly impact business practices.
Direct interaction with data
The greatest strength of data visualization, however, is its ability to bring actionable insights to the forefront. The ability to interact directly with the data is one of the main reasons it is viewed as far superior to one-dimensional tables or charts. Real-time data visualization combined with predictive analysis can lead to effective action, with the added benefit of catching snags in the system in time.
Staying up to date
The vast volume of data that is gathered by a company can provide business leaders and decision makers with valuable insights into business opportunities. Sudden shifts in customer behavior or gradual changes in market conditions are tell-tale signs that can only be identified by carefully perusing multiple data sets. Access to such insights enables the company to act on new business opportunities ahead of its rivals.
Storytelling through data
A visual data tool such as a heat map, for instance, can easily illustrate which areas of your business are doing exceptionally well and which are underperforming. This information, in turn, can develop a new business language, one that elucidates the inner workings of the business. Presenting this analysis to executives can open up new ways to look at existing operations, giving businesses a tool to scale new heights in performance.
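To make the heat map idea concrete, here is a minimal sketch in plain Python that shades a small grid of figures by magnitude. The regions, the quarterly numbers, and the shade characters are all invented for illustration; a real dashboard would use a charting tool rather than text output.

```python
# Hypothetical quarterly revenue (in thousands) per region.
# These figures are invented purely for illustration.
SALES = {
    "North": [120, 135, 150, 160],
    "South": [80, 75, 60, 55],
    "East": [100, 110, 105, 115],
    "West": [40, 90, 140, 190],
}

SHADES = ".:-*#"  # weakest to strongest


def shade(value, low, high):
    """Map a value onto one of the shade characters by linear scaling."""
    if high == low:
        return SHADES[-1]
    position = (value - low) / (high - low)
    index = min(int(position * len(SHADES)), len(SHADES) - 1)
    return SHADES[index]


def render_heat_map(data):
    """Return the heat map as a list of text rows, one per region."""
    values = [v for row in data.values() for v in row]
    low, high = min(values), max(values)
    return [
        f"{region:<6}" + " ".join(shade(v, low, high) for v in row)
        for region, row in data.items()
    ]


for line in render_heat_map(SALES):
    print(line)
```

Even in this crude form, a region that grows every quarter and one that declines are visible at a glance, without reading a single number.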
Data visualization is no longer the next big thing, but is a mandatory go-to tool for every business. This crucial addition is going to change the way analysts work with data. Data visualization promotes creative data exploration, wherein those interacting with data can easily recognize and respond to business changes more rapidly than ever before.
The growth in data complexity and the high number of disparate datasets processed by organizations today make it imperative to choose wisely when it comes to gathering, storing, and analyzing data. The traditional approach to analytics and reporting has been to use data warehouses, which store structured data for analytic processing. Today, however, we see a large volume and variety of data from disparate sources such as websites, social media, and mobile and IoT devices. The need to process this ‘Big Data’ has led to the creation of data lakes.
What are the features that set data lakes apart from data warehouses, and which is the right solution for your business needs? This post attempts to demystify the underlying concepts behind the two.
By definition, a data lake is meant to hold massive quantities of unstructured data whose structure remains undefined until the data is extracted (schema on read). Data warehouses, by contrast, are schema-on-write systems optimized for analytic processing rather than transaction handling. Here are a few differences:
In the case of data warehouses, a large amount of time is spent analyzing, processing, and profiling data. This is done mainly to simplify the model, conserve space, and lower costs. The compromise excludes a large amount of data simply because it does not answer a specific question or find a place in a defined report.
A data lake retains all sorts of data, whether structured, semi-structured, or unstructured, in large amounts. This is possible because of the difference in hardware between the two: off-the-shelf servers combined with economical storage options make scalability a non-issue.
With data warehouses, the data consists only of quantitative metrics and the attributes that describe them. New types of data are being introduced, of course, but storing them in a warehouse is both expensive and difficult.
Data lakes, on the other hand, embrace non-traditional types of data such as web server logs, sensor data, social network activity, and images. All sorts of data, regardless of source or structure, are kept in raw form, ready to be transformed when the need arises.
Data warehouses function like a repository, which means changing their structure is not necessarily arduous but is definitely time consuming. Even a well-designed warehouse that is highly adaptable to change suffers from the complexity of the data loading process, which makes the system resource-heavy and slower to respond. The ever-increasing need for faster answers is what has given rise to the concept of self-service business intelligence.
A data lake, however, does not impose a structure, and since data is always accessible, developers and data scientists are empowered to configure and reconfigure models and apps on the fly. Constant changes, whether deletions or additions, can be carried out without changing any structure or employing additional resources.
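The contrast between the two storage styles can be sketched in a few lines of Python. This is only an illustration under assumed names: the `WAREHOUSE_SCHEMA` fields, the sample events, and the helper functions are hypothetical, and plain lists stand in for the actual storage systems.

```python
import json

# Hypothetical warehouse schema: only these typed fields are accepted.
WAREHOUSE_SCHEMA = {"order_id": int, "amount": float}


def write_to_warehouse(table, record):
    """Schema on write: validate and trim the record before storing it."""
    row = {}
    for field, expected_type in WAREHOUSE_SCHEMA.items():
        if field not in record:
            raise ValueError(f"missing field: {field}")
        row[field] = expected_type(record[field])
    table.append(row)  # fields outside the schema are dropped


def write_to_lake(lake, raw_line):
    """Schema on read: store the raw text untouched."""
    lake.append(raw_line)


def read_from_lake(lake, fields):
    """Apply a schema only at query time, skipping records that lack it."""
    for raw_line in lake:
        record = json.loads(raw_line)
        if all(field in record for field in fields):
            yield {field: record[field] for field in fields}


# Invented sample events: one order and one sensor reading.
raw_events = [
    '{"order_id": 1, "amount": 19.99, "channel": "web"}',
    '{"sensor": "t-17", "reading": 21.4}',
]

warehouse, lake = [], []
for line in raw_events:
    write_to_lake(lake, line)
    try:
        write_to_warehouse(warehouse, json.loads(line))
    except ValueError:
        pass  # the sensor reading does not fit the warehouse schema

print(warehouse)
print(list(read_from_lake(lake, ["sensor", "reading"])))
```

The warehouse keeps only the order, stripped of its `channel` field, while the lake keeps both events and can still answer a question about sensor readings that nobody anticipated when the data arrived.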
Larger user base
In an organization, there are three main sets of data users: operational users, analytic users, and modelers. Operational users make up the bulk of data users, often around 80% of the user base. They use data for routine tasks such as viewing reports and key performance metrics or editing data in spreadsheets. Analytic users, forming about 10%, use the data warehouse as a source but often require additional data that is stored outside the organization, so unlike operational users they must go beyond the warehouse. Modelers differ greatly from these two: they delve into deeper analysis, which requires far more than a data warehouse can offer. Most often, these data scientists bypass the data warehouse and use advanced analytic tools and capabilities such as statistical analysis and predictive modeling.
A data lake approach supports all three of these users equally well. The data scientists can go to the lake and work with the very large and varied data sets they need while other users make use of more structured views of the data provided for their use.
Since data lakes contain all sorts of data types in raw form, they enable users to get to their results faster than the traditional data warehouse approach. However, this speed comes at a cost: the user is expected to transform, cleanse, and structure the data as they see fit. This approach works mainly for the analytic users and modelers described above, as operational users may not possess the skills to deal with such large amounts of unstructured data.
While it may seem that a data lake approach is the way to go, you don’t necessarily have to do away with your current warehouse solution. It is important to note that the data lake is not a better version of a warehouse, nor is it a replacement. They are optimized for different purposes, and the goal is to use each one for what it was designed to do. It is quite possible to use both in tandem.
A data lake can be used for sandboxing, allowing users to experiment with different data models and transformations before setting up a new schema in a data warehouse. It can also serve as a staging area from which data is supplied to a data warehouse.
By using both approaches appropriately, it is possible to find the right solution to your data storage needs.
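The staging-area role might look roughly like the following sketch. It assumes raw JSON lines as the lake content and uses an in-memory SQLite database as a stand-in for the warehouse; the `page_views` table, the event fields, and the function names are hypothetical.

```python
import json
import sqlite3

# Hypothetical raw web-server events as they might sit in the lake.
LAKE = [
    '{"user": "a", "page": "/home", "ms": 120}',
    '{"user": "b", "page": "/pricing", "ms": 340}',
    '{"user": "a", "page": "/pricing", "ms": 210}',
    'not valid json',
]


def stage(raw_lines):
    """Parse and cleanse raw lines, discarding those that cannot be parsed."""
    for line in raw_lines:
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue
        yield event["page"], int(event["ms"])


def load_warehouse(rows):
    """Load cleansed rows into a structured table (in-memory SQLite here)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE page_views (page TEXT NOT NULL, ms INTEGER NOT NULL)"
    )
    conn.executemany("INSERT INTO page_views VALUES (?, ?)", rows)
    conn.commit()
    return conn


warehouse = load_warehouse(stage(LAKE))
report = warehouse.execute(
    "SELECT page, COUNT(*), AVG(ms) FROM page_views GROUP BY page ORDER BY page"
).fetchall()
print(report)
```

The raw lines stay in the lake for anyone who wants to experiment with them, while the warehouse receives only the cleansed, structured subset that reports depend on.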
It’s a SaaS world today. Traditional software development mostly involved installing a separate instance of the software product for each customer. Software providers would install the software packages developed (or configured) for specific clients. Over the last few years, the world has started shifting from in-house infrastructure to cloud-based solutions, and software products have started becoming what the industry calls “Multi-Tenant Applications”.
Multi-tenancy is a way of architecting and developing software products so that one source code base and one server instance, among other shared components, serve multiple customers, or tenants. It is achieved using various frameworks and concepts such as convention over configuration and aspect-oriented programming (AOP). Typically, any application developed with multi-tenancy in mind can also be deployed on-premise or on colocated servers.
In this blog, we discuss the various architectural options for multi-tenancy and the key considerations for choosing one over another.
When data segregation is key
When segregation of data is critical, multi-tenant applications are developed with a single instance of the code and a single instance of the database.
We can take the bill payment application of a financial services provider as an example. A bill payment use case with multiple third-party services for payment gateway interactions is shown below. The application interface is abstracted through business-driven design and a domain-specific language so that it can use any specific gateway interface, such as BillDesk or Visa, depending on the choice made by the tenant. This selection is segregated by tenant key. Technical customizations such as themes and color combinations are also applied based on the tenant key.
This type of architecture makes implementing new features easy: even large logic changes can be ring-fenced within a specific component, which reduces maintenance.
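As a rough illustration of how a tenant key can drive both the gateway choice and the look and feel, consider the sketch below. Everything in it is assumed for the example: the tenant names, the two stand-in gateway functions, and the `TenantConfig` structure are hypothetical and do not represent any real gateway API.

```python
from dataclasses import dataclass
from typing import Callable, Dict


def pay_with_gateway_a(amount: float) -> str:
    """Stand-in for one third-party payment gateway adapter."""
    return f"gateway-a charged {amount:.2f}"


def pay_with_gateway_b(amount: float) -> str:
    """Stand-in for a second third-party payment gateway adapter."""
    return f"gateway-b charged {amount:.2f}"


@dataclass(frozen=True)
class TenantConfig:
    gateway: Callable[[float], str]  # adapter chosen by the tenant
    theme: str                       # UI customization for the tenant


# Hypothetical tenant registry keyed by tenant key.
TENANTS: Dict[str, TenantConfig] = {
    "acme": TenantConfig(gateway=pay_with_gateway_a, theme="blue"),
    "globex": TenantConfig(gateway=pay_with_gateway_b, theme="green"),
}


def pay_bill(tenant_key: str, amount: float) -> str:
    """Single code path: the tenant key decides which adapter runs."""
    config = TENANTS[tenant_key]
    return config.gateway(amount)


print(pay_bill("acme", 42.5))
print(pay_bill("globex", 42.5), TENANTS["globex"].theme)
```

Because `pay_bill` never names a gateway directly, supporting a new gateway means adding one adapter and one registry entry, which is the ring-fencing described above.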
When ease of licensing is key
The single-database type of multi-tenant architecture is preferred when the customer is comfortable keeping data off premises, data security is not a major concern, and license and maintenance costs need to be minimal.
A single database is created, and all data is stored in it. Each record is assigned a tenant key, and only data belonging to a given tenant is accessible to that tenant. Access is restricted by the application software.
The single database instance approach has a low cost because the cost of maintenance is not transferred to the tenant. This model also lets the tenant bring up or shut down the application with various licensing modules.
Data consistency, integrity, and availability are provided by the company developing the application, and the cloud-based infrastructure makes sizing and scaling easier.
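A minimal sketch of this pattern, using an in-memory SQLite database, is shown here. The `invoices` table, the `tenant_key` column name, and the `TenantRepository` class are assumptions made for the example; the point is only that every query passes through application code that adds the tenant filter.

```python
import sqlite3

# One shared database; every row carries the key of the tenant that owns it.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE invoices ("
    "tenant_key TEXT NOT NULL, number TEXT NOT NULL, amount REAL NOT NULL)"
)
conn.executemany(
    "INSERT INTO invoices VALUES (?, ?, ?)",
    [
        ("acme", "INV-1", 100.0),
        ("acme", "INV-2", 250.0),
        ("globex", "INV-1", 75.0),
    ],
)


class TenantRepository:
    """Data-access layer that adds the tenant filter to every query."""

    def __init__(self, connection, tenant_key):
        self._conn = connection
        self._tenant_key = tenant_key

    def invoices(self):
        cursor = self._conn.execute(
            "SELECT number, amount FROM invoices "
            "WHERE tenant_key = ? ORDER BY number",
            (self._tenant_key,),
        )
        return cursor.fetchall()


print(TenantRepository(conn, "acme").invoices())
print(TenantRepository(conn, "globex").invoices())
```

Note that the isolation here is only as strong as the application code: a query that forgets the `tenant_key` filter would expose every tenant's rows, which is why this option suits cases where data security is not the overriding concern.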
When security is key
Alternatively, multiple databases can be used, with a separate database storing each customer’s data. Access to each database is then restricted using SQL login credentials.
The multiple databases approach again uses the same tenant key; however, each database holds only one tenant. The approach can be used, still with a single code instance, in very specific cases such as applications with very high data security requirements or to meet tenant expectations.
The multiple database instance approach provides complete data integrity and segregation of tenant data. Customization is also easier, as each tenant uses a different database instance.
Modern products that serve multiple tenants are typically developed with a multi-tenant architecture in mind. The decision path forks in two areas, depending on whether data security or licensing takes priority, and the deployment architecture is chosen accordingly. Central to all of this is the tenant data.
The configuration and deployment of the application are typically handled by automated testing and deployment mechanisms.
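The database-per-tenant layout can be sketched as follows, again with SQLite standing in for a real database server. The tenant names and helper functions are hypothetical, and because SQLite has no login credentials, the per-database SQL logins mentioned above are not modeled; each in-memory connection simply represents a physically separate database.

```python
import sqlite3

# Hypothetical mapping of tenant key to database location.
# Each ":memory:" connection is an independent database.
TENANT_DATABASES = {"acme": ":memory:", "globex": ":memory:"}


def open_tenant_databases(locations):
    """Open one connection per tenant and create the same schema in each."""
    connections = {}
    for tenant_key, location in locations.items():
        conn = sqlite3.connect(location)
        conn.execute("CREATE TABLE invoices (number TEXT, amount REAL)")
        connections[tenant_key] = conn
    return connections


def add_invoice(connections, tenant_key, number, amount):
    """Write to the database that belongs to the given tenant only."""
    connections[tenant_key].execute(
        "INSERT INTO invoices VALUES (?, ?)", (number, amount)
    )


def list_invoices(connections, tenant_key):
    """Read from the given tenant's database; no tenant filter is needed."""
    return connections[tenant_key].execute(
        "SELECT number, amount FROM invoices ORDER BY number"
    ).fetchall()


connections = open_tenant_databases(TENANT_DATABASES)
add_invoice(connections, "acme", "INV-1", 100.0)
print(list_invoices(connections, "acme"))
print(list_invoices(connections, "globex"))
```

The same code serves both tenants, but the tenant key now selects a database rather than filtering rows, so one tenant's data cannot appear in another's results even if a query is written carelessly.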
A case in point:
We recently developed two applications, one for a boutique mortgage shop and another for an alternative financial services provider, each of which was building a product for its end clients. Multi-tenancy was such an essential part of the high-level design that the UI, product lines, and other services were customized to the fullest possible extent using the multi-tenant architecture.
We went through the above decision-making process with the client and worked with the client’s business teams to understand the key considerations: security, licensing, or data segregation. When security considerations won, we designed and built the product with a multiple database architecture.