Following objects can be edited in consistent way:
Data are represented as two-dimensional storage structure:
Data can be created by one of following ways:
New data are temporary data and need necessary parameters, like file name or table name, inputted when saved.
Existed data can be opened by one of following ways:
Except for texts file, multiple lines can be edited and saved to string values:
Click button or menu "Define Data".
Under tab "Columns", add/delete/change columns in table view:
Column names should not be null nor duplicated.
Click table cell to edit value directly.
Double click column, or select column and click button "Edit", to open eidt-column window:
Can rename all columns with sequence numbers.
Can set random colors.
Can adjust orders of columns.
Click button "OK" to apply modifications of columns to "Table" of current data.
Click button "Recover" to discard modifications of columns and pick data from "Table" of current data.
Click button "Select" copy column definitions from tree "Data Column".
Click button "Import" to load column definitions from XML file.
Click button "Export" to write column definitions as file. Supported formats include: XML/JSON/CSV/Excel.
Types of columns include: String, Double, Float, Long, Integer, Short, Boolean, Datetime, Date, Era, Longitude, Latitude, Enumeration, Editable Enumeration, Color.
This attribute is used to display, edit, calculate, and save data.
Longitutde and Latitude are defined together generally.
This attribute is mainly for display. When data are inputted/edited, formats are not applied automatically and original inputs are kept.
In some interfaces, like "Copy" or "Export", options "Save date/time/era and numbers as columns' formats" can be checked.
To numbers, format can be: group in thousands, group in ten thousands, scientific notation, and no format.
To Datetime/Date, following are supported in formats: MM/dd/yy, yy-MM-dd, milliseconds, time zone, T separator, patch century, etc..
To Era, following are supported in formats: MM/dd/yy, yy-MM-dd, milliseconds, time zone, T separator, patch century, prefix/suffix of "AD" and "BC" in Chinese and English,etc..
To Enumeration, list of values can be defined.
How columns handle invalid values, including: Use, Skip, Count as empty, and Count as zero.
Some data manufacture may set how to handle invalid values, which overrides column definition.
Column color is mainly used for data charts.
When chart is generated, elements in it are displayed in colors as their columns' definitions. Then user can set chart in random colors.
Under tab "Attributes", set: data name, decimal scale, maximum value of random, and description.
The verified objects can be either rows in current page or all rows.
Following are checked:
If a column is defined as "not null", then null values are invalid for this column.
If a value is not satisfied with the column type, then the value is invalid.
Handle invalid value when edit:
Reject invalid value, and recover original value.
To data file, there is option to not reject invalid value and it is accepted while shown in abnormal color.
To matrix or database table, invalid value is always rejected.
Handle invalid value when save:
Reject invalid value, and fail to save.
To data file, there is option to not reject invalid value and it is written in file.
To matrix or database table, invalid value is always rejected.
Handle invalid value in other operations:
Options to choice: Fail, Use, Skip, Count as null, Count as zero.
If select "Fail", operation fails when invalid value is found.
If select "Use", invalid value is taken and applied in operation.
If select "Skip", involved column or row is ignored and does not take part in operation.
If select "Count as null", invalid value is replaced with null and take part in operation.
If select "Count as zero", invalid value is replaced with zero and take part in operation.
In some context, "Count as null" equals to "Skip".
Principle of column usages is "Most tolerability and least manufacture".
When load data, types of columns are not checked, and original values are read and imported.
Parse values as columns' types.
Rewrite values as columns' formats.
Displayed invalid values in abnormal color.
To Boolean, checkbox is provided.
To Enumeration, list view is provided with selections.
To Color, palette is provided.
To Longitude and Latitude, map can be popped to locate coordinate.
Click editable cell to start editing.
When editing is started, its original value is displayed, while both type and format of column are ignored.
While user inputs and modifies the value, value in the edit control is checked by column type:
If value is invalid, then edit contorl is displayed in abnormal color.
If value is valid, then edit control is displayed in normal color.
The value is always kept as what user has inputted.
Press key ENTER to commit modification, and press key ESC to cancel editing.
Option: Auto-commit modification when cell loses focus(click other control).
When modification is committed, value in the edit control is checked:
If value is not changed, then no checking of column type nor saving.
If value is changed, then check the changed value as column type:
To invalid value, if select "Reject invalid value when edit", then restore and display original value; if not select rejection, then commit modification and display it in abnormal color.
To valid value, submit and save as new value, and then dsiplay saved value as type and format of the column.
Example, column type is Double, select "Reject invalid value when edit", and decimal scale is 2. When read "abc":
The data cell is displayed as "abc" in abnormal color.
User modifies it as "abc123":
While user inputs the change, the text field is always in abnormal color.
After user clicks Enter or other control, this data cell comes back as "abc" in abnormal color.
User modifies it as "123.4567":
While user inputs the change, the text field is always in normal color.
After user clicks Enter or other control, this data cell is saved as "123.4567" and displayed as "123.46".
Other data cells are not affected. That is, data cells are always in originial values if they are not changed.
Values are written into file of CSV/Texts/Excel as strings.
Option "Reject invalid value when edit".
Invalid values are always rejected for database table.
Values are written into database table as nearest types:
Column Type of MyBox |
Data Type of JDBC |
String |
VARCHAR |
Double |
DOUBLE |
Float |
FLOAT |
Long |
BIGINT |
Integer |
INT |
Short |
SMALLINT |
Boolean |
BOOLEAN |
Datetime |
TIMESTAMP |
Date |
DATE |
Era |
BIGINT |
Longitude |
DOUBLE |
Latitude |
DOUBLE |
Enumeration |
VARCHAR |
Color |
VARCHAR |
Notice: derby does not support negative date, so Ear is saved as long.
Invalid values are always rejected for matrix.
All values in Matrix are saved as Double.
Data are handled as original values, without concern about types and formats of columns.
Values are parsed as need. Example, if the calculation requires double values, then try to convert values as doubles.
If value conversion fails, then invalid value is handled as definition of column.
Calculation itself can define how to handle invalid values. Definition of calculation priors to definition of column.
Both column and calcultion can define decimal scale. Definition of calculation priors to definition of column.
Example, Column type is String, and descriptive statistic is running against it:
Try to convert each value as double.
If "count invalid value as zero", then invalid values are calculated into "Count", and participate in calculations of "Mean" and "Sum".
If "skip invalid value", then invalid values are not involved in any calculation.
If "count invalid value as empty", then invalid values cause results of all calculations as invalid(Double.NaN).
For any calculations involved in sorting, data will be translated into a temporary database table, and be sorted by database system.
Results of sorting are related to columns' types. Example, string "124" is smaller than string "18", while number "124" is bigger than number "18".
MyBox records definition of data objects in its internal table:
csv/excel/text files |
MyBox Clipboard |
Matrice |
Database tables |
|
Time to create data definition |
When data file is opened for the first time |
When data are copied in MyBox Clipboard |
When new matrix is saved in Matrices Manager |
When new table is saved in Database Table Manager |
Storage location of data |
Data file |
Files under MyBox internal path. |
Database table of MyBox |
Database tables of MyBox |
When delete data definition |
Data file is not affected |
Internal file is deleted |
Data of matrix are cleared |
Database table is dropped |
Data files are external data. MyBox records their definition and keeps their independence.
After read/write by MyBox, data files should be able to read/write by other ways properly.
In CSV file:
When MyBox handles CSV file:
In Excel file:
When MyBox handles Excel file:
Notice: Tool can only handle base data in Excel file. If file includes format, style, formula, or chart, suggest to save changes as new file to avoid data loss.
In texts file:
When MyBox handles texts file:
Temporary data files are generated when:
Temporary data are saved in CSV format, under internal temporary files path of MyBox by default.
If files are under MyBox temporary files path:
Option: Save temporary data under "generated" path.
When this option is checked, temporary data files are not treated as "temporary files"。
More details can be referred in "User Guide - Data Tools"