A Big Data strategy must include criteria to evaluate:
- What problems the organization is trying to solve and what it needs analytics for: While one advantage of Data Science is that it can provide a new perspective on an organization, the organization still needs a starting point. An organization may determine that the data is to be used to understand the business or the business environment; to prove ideas about the value of new products; to explore something that is unknown; or to invent a new way to do business. It is important to establish a gating process that evaluates the value and feasibility of these initiatives at several phases during implementation.
- What data sources to use or acquire: Internal sources may be easy to use, but may also be limited in scope. External sources may be useful, but are outside operational control (managed by others, or not controlled by anyone, as in the case of social media). Many vendors are competing in this space and often multiple sources exist for the desired data elements or sets. Acquiring data that integrates with existing ingestion items can reduce overall investment costs.
- The timeliness and scope of the data to provision: Many elements can be provided as real-time feeds, as snapshots at a point in time, or as integrated and summarized data. Low-latency data is ideal, but often comes at the expense of machine learning capabilities: computational algorithms directed at data-at-rest differ greatly from those directed at streaming data. Do not underestimate the level of integration required for downstream usage.
- The impact on and relation to other data structures: There may need to be structure or content changes in other data structures to make them suitable for integration with Big Data sets.
- Influences on existing modeled data: These include extending knowledge about customers, products, and marketing approaches.
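
To make the data-at-rest versus streaming distinction from the timeliness criterion concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the `readings` values are made up, the `RunningStats` class is a hypothetical helper, and Welford's online algorithm is used as one common example of a streaming computation, not as a method prescribed by the strategy itself. With data-at-rest the whole set is available and can be scanned freely; with a stream each value is seen once and only a small running state is kept.

```python
from statistics import mean, pvariance


class RunningStats:
    """Hypothetical helper: Welford's online algorithm (one pass, constant memory)."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        # Fold one new observation into the running state.
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self):
        # Population variance of everything seen so far.
        return self._m2 / self.count if self.count else 0.0


# Made-up sample values, for illustration only.
readings = [12.0, 15.5, 11.2, 14.8, 13.1]

# Data-at-rest: the full data set is in hand, so library routines can scan it.
batch_mean = mean(readings)
batch_variance = pvariance(readings)

# Streaming: values arrive one at a time and are never stored.
stream = RunningStats()
for value in readings:
    stream.update(value)

print(batch_mean, stream.mean)
print(batch_variance, stream.variance)
```

Both approaches agree on these simple statistics, up to floating-point rounding. Many machine learning techniques, however, assume repeated passes over the full data set, which is one reason low-latency provisioning can limit what is feasible.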