unable to initialize frontend: Dialog

These errors don’t stop the image from being built but inform you that the installation process tried to open a dialog box, but was unable to. Generally, these errors are safe to ignore.

Some people circumvent these errors by changing the DEBIAN_FRONTEND environment variable inside the Dockerfile using:

ENV DEBIAN_FRONTEND=noninteractive

This prevents the installer from opening dialog boxes during installation which stops the errors.

While this may sound like a good idea, it may have side effects. The DEBIAN_FRONTEND environment variable will be inherited by all images and containers built from your image, effectively changing their behavior. People using those images will run into problems when installing software interactively, because installers will not show any dialog boxes.

Because of this, and because setting DEBIAN_FRONTEND to noninteractive is mainly a ‘cosmetic’ change, we discourage changing it.

Apache Tajo™: A big data warehouse system on Hadoop

Dubbed an “SQL-on-Hadoop” solution, Apache Tajo is used for low-latency and scalable ad-hoc queries, online aggregation, and ETL (extract-transform-load process) on large data sets stored on HDFS (Hadoop Distributed File System) and other data sources. By supporting SQL standards and leveraging advanced database techniques, Tajo allows direct control of distributed execution and data flow across a variety of query evaluation strategies and optimization opportunities. Overall, Apache Tajo v0.9 delivers more powerful native SQL support on an even faster platform.


what does schema Less means to no-sql ?

Schema-less is a bit of a misnomer, it’s better to think of it as:

  • SQL = Schema enforced by a RDBMS on Write
  • NoSQL = Partial Schema enforced by the DBMS on Write, PLUS schema fully enforced by the Application on Read (Externalised schema)

So while a supposed Schema-less NoSQL data-store will in theory allow you to store any data you like (typically key value pairs, in a document) without prior knowledge of the keys, or data types, it will be pointless unless you have some mechanism to retrieve and use the data. So essentially the schema is partially moved from the RDBMS into the application code. I say partially as you’ll have added Indexes to document collections and or partitioned the data for performance, so the NoSQL DBMS will have a partial schema defined locally, and possibly enforced via Unique constraints.

As to adding additional attributes to a document/object in the store. Depending on how much padding is around the document (un-used space), in it’s physical data block, adding a few more key value pairs to the documents may result in the document having to be physically moved to a larger contiguous block of storage, and the associated indexes re-built. If you plan to use the new keys in a frequently utilised query then you’ll be wanting to also add a suitable new index, which will obviously require some physical storage, take a while to initially build and possibly lead you to ask the sysadmin to allocate more memory to the DBMS, to allow the new index(s) to be cached.

A good start is to read this paper here .Couchbase_Whitepaper_Transitioning_Relational_to_NoSQL

D3.js for visualisation with big data

D3.js is a JavaScript library for manipulating documents based on data. D3 helps you bring data to life using HTML, SVG and CSS. D3’s emphasis on web standards gives you the full capabilities of modern browsers without tying yourself to a proprietary framework, combining powerful visualization components and a data-driven approach to DOM manipulation.