Data in Two-dimensional Storage Structure

1. Data Objects

  1. Following objects can be edited in consistent way:

  1. Data are represented as two-dimensional storage structure:

2. Define Columns

2.1 Interface of Columns Management

Under tab “Columns”, add/delete/change columns in table view:

  1. Column names should not be null nor duplicated.

  2. Click table cell to edit value directly.

  3. Select one row and click button “Edit” to open edit window.

  4. Can rename all columns with sequence numbers.

  5. Can set random colors.

  6. Can adjust orders of columns.

  7. Click button "OK" to apply modifications of columns to "Table" of current data.

  8. Click button "Cancel" to discard modifications of columns and pick data from "Table" of current data.

  9. When changes have not been saved, if modifications have not been applied to data, “**” is displayed in tab header, or else “*” is displayed in tab header.

2.2 Types of Columns

  1. Types of columns include: String, Double, Float, Long, Integer, Short, Boolean, Datetime, Date, Era, Longitude, Latitude, Enumeration, Color.

  2. This attribute is used to display, edit, calculate, and save data.

  3. Longitutde and Latitude are defined together generally.

2.3 Format of Column

  1. This attribute is mainly for display. When data are inputted/edited, formats are not applied automatically and original inputs are kept.

  2. In some interfaces, like "Copy" or "Export", options "Save date/time/era and numbers as columns' formats" can be checked.

2.3.1 Format for Numbers

To numbers, format can be: group in thousands, group in ten thousands, scientific notation, and no format.

2.3.2 Format for Datetime/Date

To Datetime/Date, following are supported in formats: MM/dd/yy, yy-MM-dd, milliseconds, time zone, T separator, patch century, etc..

2.3.3 Format for Era

To Era, following are supported in formats: MM/dd/yy, yy-MM-dd, milliseconds, time zone, T separator, patch century, prefix/suffix of “AD” and “BC” in Chinese and English,etc..

2.3.4 Define Enumeration

To Enumeration, list of values can be defined.

2.4 Handle Invalid Values

  1. How columns handle invalid values, including: skip, count as empty, and count as zero.

  2. In some context, “count as empty” equals to “skip”.

  3. This attribute is only used for display or calculation. When data are inputted/edited, invalid values are not handled automatically.

3. Usages of Columns

Principle of column usages is “Most tolerability and least manufacture”.

3.1 Load Data

When load data, types of columns are not checked, and original values are read and imported.

3.2 Display Data

  1. Parse values as columns' types.

  2. Handle invalid values as columns' definitions.

  3. Rewrite values as columns' formats.

  4. Displayed values may be different from current actual values.

  5. Example, Column type is Double and value “abc” is read:

3.3 Controls for editing

3.4 Edit Data Cell

  1. Click editable cell to start editing.

  2. When editing is started, its original value is displayed, while both type and format of column are ignored.

  3. While user inputs and modifies the value, value in the edit control is checked by column type:

  1. Press key ENTER to commit modification, and press key ESC to cancel editing.

  2. Option: Auto-commit modification when cell loses focus(click other control).

  3. When modification is committed, value in the edit control is checked:

  1. Example, column type is Double, count invalid as empty, and decimal scale is 2. When read “abc”:

  1. Other data cells are not affected. That is, data cells are always in originial values if they are not changed.

3.5 Verify Data Values

  1. The verified objects are rows in current page.

  2. To start the verification:

  1. Following are checked:

  1. The results of verification are displayed in a html table.

3.6 Save Data

  1. Values are written into file of CSV/Texts/Excel as strings.

  2. Values are written into database table as nearest types:

Column Type of MyBox

Data Type of JDBC

String

VARCHAR

Double

DOUBLE

Float

FLOAT

Long

BIGINT

Integer

INT

Short

SMALLINT

Boolean

BOOLEAN

Datetime

TIMESTAMP

Date

DATE

Era

BIGINT

Longitude

DOUBLE

Latitude

DOUBLE

Enumeration

VARCHAR

Color

VARCHAR



Notice: derby does not support negative date, so Ear is saved as long.

  1. All values in Matrix are saved as Double.

3.7 Calculate Data

  1. Data are handled as original values, without concern about types and formats of columns.

  2. Values are parsed as need. Example, if the calculation requires double values, then try to convert values as doubles.

  3. If value conversion fails, then invalid value is handled as definition of column.

  4. Calculation itself can define how to handle invalid values. Definition of calculation priors to definition of column.

  5. Both column and calcultion can define decimal scale. Definition of calculation priors to definition of column.

  6. Example, Column type is String, and descriptive statistic is running against it:

3.8 Sort Data

  1. For any calculations involved in sorting, data will be translated into a temporary database table, and be sorted by database system.

  2. Results of sorting are related to columns' types. Example, string “124” is smaller than string “18”, while number “124” is bigger than number “18”.

3.9 Color of Column

Column color is mainly used for data charts.

When chart is generated, elements in it are displayed in colors as their columns' definitions.. Then user can set chart in random colors.

4. Edit Data

4.1 Mode "Table"

"Table" is the master edit mode:

    1. Its modifications are applied to other panes automatically.

    2. It is the final data to save.

    3. To string values, except for text file, multiple lines can be edited and saved:

4.2 Mode "CSV"

"CSV" is the assist edit mode.

    1. Click button "OK" to apply its modifications to "Table".

    2. Click button "Cancel" to discard its modifications and pick data from "Table".

    3. Click button "Delimiter" to pick data from "Table" and apply new delimiter while its modifications are discarded.

    4. This delimiter does not affect source file.

    5. If value contains delimeter or line break, it should be surrounded by quotes.

4.3 Save and Recover

  1. When changes are not saved, * is displayed in tab header.

  2. Click button "Save" to write modifications to file and database:

  1. Click button "Recover" to discard all modifications and load data from file and database.


5. Manage Data

MyBox records definition of data objects in its internal table:


csv/excel/text files

MyBox Clipboard

Matrice

Database tables

Time to create data definition

When data file is opened for the first time

When data are copied in MyBox Clipboard

When new matrix is saved in Matrices Manager

When new table is saved in Database Table Manager

Storage location of data

Data file

Files under MyBox internal path.

Database table of MyBox

Database tables of MyBox

When delete data definition

Data file is not affected

Internal file is deleted

Data of matrix are cleared

Database table is dropped



More details can be referred in "User Guide - Data Tools"