Following objects can be edited in consistent way:
Data files, including CSV File, Excel file, texts file.
Data of MyBox Clipboard
Matrices
Database tables.
Data are represented as two-dimensional storage structure:
"Columns" define attributes/fields of data and extend data dimensions in horizontal direction.
"Rows" store instances of data and extend data set in vertical direction.
Data should be in same width. That is all rows have equal number of columns.
Under tab “Columns”, add/delete/change columns in table view:
Column names should not be null nor duplicated.
Click table cell to edit value directly.
Select one row and click button “Edit” to open edit window.
Can rename all columns with sequence numbers.
Can set random colors.
Can adjust orders of columns.
Click button "OK" to apply modifications of columns to "Table" of current data.
Click button "Cancel" to discard modifications of columns and pick data from "Table" of current data.
When changes have not been saved, if modifications have not been applied to data, “**” is displayed in tab header, or else “*” is displayed in tab header.
Types of columns include: String, Double, Float, Long, Integer, Short, Boolean, Datetime, Date, Era, Longitude, Latitude, Enumeration, Color.
This attribute is used to display, edit, calculate, and save data.
Longitutde and Latitude are defined together generally.
This attribute is mainly for display. When data are inputted/edited, formats are not applied automatically and original inputs are kept.
In some interfaces, like "Copy" or "Export", options "Save date/time/era and numbers as columns' formats" can be checked.
To numbers, format can be: group in thousands, group in ten thousands, scientific notation, and no format.
To Datetime/Date, following are supported in formats: MM/dd/yy, yy-MM-dd, milliseconds, time zone, T separator, patch century, etc..
To Era, following are supported in formats: MM/dd/yy, yy-MM-dd, milliseconds, time zone, T separator, patch century, prefix/suffix of “AD” and “BC” in Chinese and English,etc..
To Enumeration, list of values can be defined.
How columns handle invalid values, including: skip, count as empty, and count as zero.
In some context, “count as empty” equals to “skip”.
This attribute is only used for display or calculation. When data are inputted/edited, invalid values are not handled automatically.
Principle of column usages is “Most tolerability and least manufacture”.
When load data, types of columns are not checked, and original values are read and imported.
Parse values as columns' types.
Handle invalid values as columns' definitions.
Rewrite values as columns' formats.
Displayed values may be different from current actual values.
Example, Column type is Double and value “abc” is read:
If count invalid as zero, then display it as zero(actual value is still “abc”).
If skip or count invalid as empty, then display it as “abc”.
To Boolean, checkbox is provided.
To Enumeration, list view is provided with selections.
To Color, palette is provided.
To Longitude and Latitude, map can be popped to locate coordinate.
Click editable cell to start editing.
When editing is started, its original value is displayed, while both type and format of column are ignored.
While user inputs and modifies the value, value in the edit control is checked by column type:
If value is invalid, then edit contorl is displayed in abnormal color.
If value is valid, then edit control is displayed in normal color.
The value is always kept as what user has inputted.
Press key ENTER to commit modification, and press key ESC to cancel editing.
Option: Auto-commit modification when cell loses focus(click other control).
When modification is committed, value in the edit control is checked:
If value is not changed, then no checking of column type nor saving.
If value is changed, then check the changed value as column type:
To invalid value, discard it, and then restore and display original value.
To valid value, submit and save as new value, and then dsiplay saved value as type and format of the column.
Example, column type is Double, count invalid as empty, and decimal scale is 2. When read “abc”:
The data cell is displayed as “abc”.
User modifies it as “abc123”:
While user inputs the change, the text field is always in abnormal color.
After user clicks Enter or other control, this data cell comes back as “abc” in normal color.
User modifies it as “123.4567”:
While user inputs the change, the text field is always in normal color.
After user clicks Enter or other control, this data cell is saved as “123.4567” and displayed as “123.46”.
Other data cells are not affected. That is, data cells are always in originial values if they are not changed.
The verified objects are rows in current page.
To start the verification:
Click button “Verify data in current page”.
Option: Verify values automatically when save data.
Following are checked:
If a column is defined as “not null”, then null values are invalid for this column.
If a value is not satisfied with the column type, then the value is invalid.
The results of verification are displayed in a html table.
Values are written into file of CSV/Texts/Excel as strings.
Values are written into database table as nearest types:
Column Type of MyBox |
Data Type of JDBC |
String |
VARCHAR |
Double |
DOUBLE |
Float |
FLOAT |
Long |
BIGINT |
Integer |
INT |
Short |
SMALLINT |
Boolean |
BOOLEAN |
Datetime |
TIMESTAMP |
Date |
DATE |
Era |
BIGINT |
Longitude |
DOUBLE |
Latitude |
DOUBLE |
Enumeration |
VARCHAR |
Color |
VARCHAR |
Notice: derby does not support negative date, so Ear is saved as long.
All values in Matrix are saved as Double.
Data are handled as original values, without concern about types and formats of columns.
Values are parsed as need. Example, if the calculation requires double values, then try to convert values as doubles.
If value conversion fails, then invalid value is handled as definition of column.
Calculation itself can define how to handle invalid values. Definition of calculation priors to definition of column.
Both column and calcultion can define decimal scale. Definition of calculation priors to definition of column.
Example, Column type is String, and descriptive statistic is running against it:
Try to convert each value as double.
If “count invalid value as zero”, then invalid values are calculated into “Count”, and participate in calculations of “Mean” and “Sum”.
If “skip invalid value”, then invalid values are not involved in any calculation.
If “count invalid value as empty”, then invalid values cause results of all calculations as invalid(Double.NaN).
For any calculations involved in sorting, data will be translated into a temporary database table, and be sorted by database system.
Results of sorting are related to columns' types. Example, string “124” is smaller than string “18”, while number “124” is bigger than number “18”.
Column color is mainly used for data charts.
When chart is generated, elements in it are displayed in colors as their columns' definitions.. Then user can set chart in random colors.
"Table" is the master edit mode:
Its modifications are applied to other panes automatically.
It is the final data to save.
To string values, except for text file, multiple lines can be edited and saved:
When the value is single line(not contain line break):
Text field is shown when click the data cell.
Write "\n" as line break in the value and commit the change(return or click other place).
When the value contains line breaks:
Text area is shown when click the data cell.
Write the text in multiple lines directly.
"CSV" is the assist edit mode.
Click button "OK" to apply its modifications to "Table".
Click button "Cancel" to discard its modifications and pick data from "Table".
Click button "Delimiter" to pick data from "Table" and apply new delimiter while its modifications are discarded.
This delimiter does not affect source file.
If value contains delimeter or line break, it should be surrounded by quotes.
When changes are not saved, * is displayed in tab header.
Click button "Save" to write modifications to file and database:
Changes of rows in "Table", including modify/add/delete/sort, affect rows of current page in file.
Changes in "Columns" tab, including modify/add/delete/sort, affect all rows in file.
Changes of attributes and columns are saved in database.
Click button "Recover" to discard all modifications and load data from file and database.
MyBox records definition of data objects in its internal table:
csv/excel/text files |
MyBox Clipboard |
Matrice |
Database tables |
|
Time to create data definition |
When data file is opened for the first time |
When data are copied in MyBox Clipboard |
When new matrix is saved in Matrices Manager |
When new table is saved in Database Table Manager |
Storage location of data |
Data file |
Files under MyBox internal path. |
Database table of MyBox |
Database tables of MyBox |
When delete data definition |
Data file is not affected |
Internal file is deleted |
Data of matrix are cleared |
Database table is dropped |
More details can be referred in "User Guide - Data Tools"