4.0 ICE User Guide
WWW.INFOBRIGHT.COM
COPYRIGHT NOTICE
The materials provided herein are Copyright © 2005-2011 Infobright Inc.
CONFIDENTIAL: The information contained in this document is the property of Infobright Inc. Except as
specifically authorized in writing by Infobright, the holder of this document shall keep the information
contained herein confidential and shall protect same in whole or in part from disclosure or dissemination
to third parties.
If these materials were purchased as a digital download, Infobright hereby grants the purchaser
permission to reproduce a single copy (print or download) of the materials without prior written
permission.
If these materials were purchased in printed form, no part of these materials shall be reproduced or
retransmitted by any means, electronic, mechanical, photocopying, recording, or otherwise without
written permission from Infobright.
Contents
1. About Infobright
Infobright Overview
Infobright and MySQL
2. Setting up Infobright
Technical Requirements
Linux for Infobright
Installing Infobright
Windows Installation Instructions
Uninstalling on Windows
Linux Installation Instructions
RPM and DPKG Install
Uninstalling on Linux
TAR Install
Windows Upgrade Instructions
Updating Table Structures (Versions Prior to ICE 3.3.2 Only)
Linux Upgrade Instructions
RPM or DPKG Upgrade
TAR Upgrade
Updating Table Structures (Versions Prior to ICE 3.3.2 Only)
Configuring Infobright
Configuration Tips and Examples
3. Using Infobright
Starting and Stopping the Infobright Server
Windows
Linux
Working with the Infobright Server
Windows
Linux
Checking the Infobright Version
Subqueries
Query Performance
1. About Infobright
Infobright Overview
Thank you for choosing to install Infobright Community Edition (ICE) 4.0.2 RC1. Infobright is
a column-oriented, high performance analytic engine designed for analytic applications and
data marts that need fast query response across large data volumes. Infobright was designed
specifically for large volume data analytics applications with up to 50TB of data.
Infobright executes complex or ad hoc queries across vast amounts of data with a low cost of
ownership.
Infobright consists of several layers. The upper layers are provided by the MySQL server
implementation, and the lower layers are provided by Infobright.
Infobright includes both its own optimizer and executor along with the storage engine. The
MySQL query engine can be used with Infobright; however, since the MySQL storage engine
interface is row oriented, it cannot take full advantage of the column orientation or the
Knowledge Grid, and hence query performance via this path is reduced. Queries will be directed
to the Infobright optimizer whenever possible.
Infobright ships with the full MySQL binaries required, including the MyISAM storage
engine. MyISAM is used to store catalog information (as with other storage engines) and you
can use the MyISAM instance for other purposes but joining MyISAM and Infobright tables
may result in reduced performance as the MySQL query engine will be used.
• Mature connectors, tools and resources
• Load function that compresses data
Since other storage engines, like InnoDB and Falcon, are not included in the Infobright
distribution, they must be run as separate instances (executables). If you wish to combine
other storage engines with Infobright, you will need to look at a database federation
application (some BI tools provide this).
2. Setting up Infobright
Technical Requirements
Before installing Infobright, review the following technical requirements.
Debian “Lenny”
CentOS 5.2
Intel 32-bit
AMD 64-bit
AMD 32-bit
Important: 32-bit platforms are for solution testing purposes only, and not recommended for
performance or multi-user testing, or production deployments.
See Appendix C, “Linux Tuning Settings” on page 72 for a list of tuning suggestions.
Installing Infobright
The Infobright installation packages are provided as an RPM, DEB, PKG, .exe, or tarball. For
non-Windows platforms, the user installing Infobright must be the root user or a user with
the necessary permissions to install files, create the user mysql and create the group mysql.
6. The Install Wizard automatically creates ICE as a Windows Service, which allows the
Infobright server to be started and stopped automatically when you boot or shut down
Windows. If you do not want ICE to start on boot, open the Services window from the
Control Panel and change the Startup Type for Infobright from “Automatic” to “Manual”.
7. The Install Wizard automatically determines the optimum memory settings based on
the physical memory of the system. You may change these settings by editing the file
brighthouse.ini within the data directory.
Important: The memory settings assume that there are no other services on the machine
consuming significant memory. If this is not the case, please lower the memory
settings for Infobright. See “Recommended Memory Configurations” on page
23.
Uninstalling on Windows
1. To uninstall ICE, select “Infobright Uninstall” under the Infobright program group in the
Windows Start Menu:
Start/All Programs/Infobright/Infobright Uninstall
www.infobright.org/Download/ICE/
Important: Do not install in the root or home directories due to possible MySQL permission
checking issues during install, start up, and/or load. If you use the rpm --prefix
option, you should manually create a softlink to the Infobright install directory
from /usr/local/infobright .
You can run this script at any time after installation to change the datadir, CacheFolder,
socket, or port. The script must be run as root, and Infobright must not be running.
Cachedir Path to the directory where temporary files will be created and stored. Should be
located on a fast drive, possibly not the same as the data. Allow at least 100 GB of
free space (depending on database size).
Socket Socket connection point for client connections. (The socket connection point will be
created during the Infobright installation.)
4. The installation determines the optimum memory settings based on the physical memory
of the system. You may change these settings by editing the file brighthouse.ini within
the data directory. See “Recommended Memory Configurations” on page 23.
Important: The memory settings assume that there are no other services on the machine
consuming significant memory. If this is not the case, please lower the memory
settings for Infobright.
Uninstalling on Linux
• To uninstall Infobright, run:
rpm -e infobright
or
dpkg -r infobright
TAR Install
2. Change to the parent location in which you want to install (e.g. /usr/local):
cd /usr/local
Important: Do not install in the root or home directories due to possible MySQL permission
checking issues during install, start up, and/or load.
3. Unpack the tarball, which will create the product directory (e.g. infobright-4.0.2-x86_64_ice),
and create a symbolic link ‘infobright’ to the product folder:
gunzip < /path/to/infobright-4.0.2-x86_64_ice.tar.gz | tar xvf -
ln -s /usr/local/infobright-4.0.2-x86_64_ice infobright
cd /usr/local/infobright
4. Run the install script with the “--help” flag to check the system configuration and see
examples of directory parameters:
./install-infobright.sh --help
Parameters required:
--datadir=infobright data folder
[--datadir=/usr/local/infobright/data]
--cachedir=infobright cache folder
[--cachedir=/usr/local/infobright/cache]
--config=mysql conf file to be created
[--config=/etc/my-ib.cnf]
--port=infobright server port [--port=5029]
--socket=socket file to be used by this server
[--socket=/tmp/mysql-ib.sock]
--user=user to be created if not exist [--user=mysql]
--group=user group to be created if not exist [--group=mysql]
Cachedir Path to the directory where temporary files will be created and stored. Should
be located on a fast drive, possibly not the same as the data. Allow at least 100
GB of free space (depending on database size).
Config MySQL configuration file. (The configuration file will be created with defaults
during the Infobright installation.)
Socket Socket connection point for client connections. (The socket connection point
will be created during the Infobright installation.)
User System user who can run the Infobright server instance. User will be created if
it does not exist. The default user is mysql.
Group System group for the above user. Group will be created if it does not exist. The
default group is mysql.
Run the install script again, this time with directory parameters. If you specify parameter
values that already exist (for example, by running the same script with the same parameters
twice), an error will occur.
Example command:
./install-infobright.sh --datadir=/usr/local/infobright/data \
  --cachedir=/usr/local/infobright/cache --port=5029 \
  --config=/etc/my-ib.cnf --socket=/tmp/mysql-ib.sock \
  --user=mysql --group=mysql
5. Change the default memory configuration by editing the file brighthouse.ini within the
data directory. See “Recommended Memory Configurations” later in this chapter.
Important: It is critical that you increase the memory settings for systems with more than
2 GB of physical memory, or performance will be severely impacted.
1. Please follow the standard ICE Windows installation instructions. The Install Wizard
automatically detects a previous version of ICE and upgrades your ICE installation while
preserving your data and configuration settings. The install procedure automatically runs
the Configuration Manager.
3. Create or ensure that the directory c:\tmp exists (necessary for step 4).
4. Run the MySQL Upgrade utility from the Windows command line:
cd "C:\Program Files\Infobright\bin"
mysql_upgrade.exe --defaults-file="c:\Program Files\Infobright\my-ib.ini" -uroot --tmpdir=c:\tmp
Important: The MySQL Upgrade utility may display several errors regarding the use of
locks with log tables and errors requiring table upgrades. The errors are all
handled automatically by Infobright and/or the upgrade utility and can be
ignored.
5. Stop and start the Infobright server from the Start Menu items.
6. If you are upgrading from a version prior to ICE 3.3.2, you must update your table
structures. See the next section for details.
If you are upgrading from a version prior to ICE 3.3.2, you must update your table structures
after upgrading ICE. Do NOT follow these instructions if you are upgrading from ICE 3.3.2
or higher or you may experience data corruption. If you are unsure what version of ICE you
are using, please contact Professional Services.
2. Run the Charset Migration Tool from the Windows command line:
cd "C:\Program Files\Infobright\bin"
chmt.exe --datadir=\absolute\path\to\data\directory
To upgrade using the rpm or deb package, simply run the installation command and the
package will automatically identify that Infobright is already installed and switch to upgrade
mode. Your configuration settings and data will not be changed during the upgrade.
Important: If the previous installation was done using the tarball package, you must
upgrade using the tarball package (see instructions below) or contact Infobright
Support to move from a tar install to a package install.
Important: The MySQL Upgrade utility may display several errors regarding the use of
locks with log tables and errors requiring table upgrades. The errors are all
handled automatically by Infobright and/or the upgrade utility and can be
ignored.
6. If you are upgrading from a version prior to ICE 3.3.2, you must update your table
structures. See “Updating Table Structures (Versions Prior to ICE 3.3.2 Only)” on page
13 for details.
TAR Upgrade
1. Unpack the tarball into a temporary folder. Use the gunzip utility for unpacking:
cd /path/to/temp/
gunzip < /path/to/infobright-4.0.2-x86_64.tar.gz | tar xvf -
3. Run the install script with the “--upgrade” and “--config” flags and pass in the
configuration files of the previously installed version:
./install-infobright.sh --upgrade --config=/etc/my-ib.cnf
4. Start the Infobright server and run the mysql_upgrade utility:
/etc/init.d/mysqld-ib start
cd /usr/local/infobright
./bin/mysql_upgrade --defaults-file=/etc/my-ib.cnf --user=root --tmpdir=/tmp
Important: The MySQL Upgrade utility may display several errors regarding the use of
locks with log tables and errors requiring table upgrades. The errors are all
handled automatically by Infobright and/or the upgrade utility and can be
ignored.
7. If you are upgrading from a version prior to ICE 3.3.2, you must update your table
structures. See “Updating Table Structures (Versions Prior to ICE 3.3.2 Only)” on page
13 for details.
If you are upgrading from a version prior to ICE 3.3.2, you must update your table structures
after upgrading ICE. Do NOT follow these instructions if you are upgrading from ICE 3.3.2
or higher or you may experience data corruption. If you are unsure what version of ICE you
are using, please contact Professional Services.
cd /usr/local/infobright
./bin/chmt --datadir=/absolute/path/to/data/directory
Configuring Infobright
The Infobright configuration file is called brighthouse.ini and is located in the data
subdirectory within your Infobright installation directory. The configuration file is a text file
containing the Infobright configuration parameters. See the Infobright installation package
for a sample brighthouse.ini file.
Important: It is critical that you specify increased memory settings for systems with more
than 2 GB of physical memory to ensure optimal performance.
Each parameter is shown on a separate line and uses the following form:
ParameterName=ParameterValue
If a parameter is not present in the configuration file or if the configuration file does not exist,
the default values are used. Blank lines and comments (lines starting with #) are ignored.
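The parsing rules above can be illustrated with a short Python sketch. This is a hypothetical helper and not part of the Infobright distribution; it assumes the only defaults of interest are the minimum values of 600 and 320 that this guide gives for ServerMainHeapSize and LoaderMainHeapSize, and it keeps all values as strings.

```python
# Hypothetical helper, not shipped with Infobright.
# Defaults: the application minimums this guide states for the two heap sizes.
DEFAULTS = {"ServerMainHeapSize": "600", "LoaderMainHeapSize": "320"}


def parse_brighthouse_ini(text):
    """Parse ParameterName=ParameterValue lines into a dict of strings."""
    settings = dict(DEFAULTS)  # used when a parameter (or the file) is missing
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and comment lines are ignored
        name, sep, value = line.partition("=")
        if not sep:
            continue  # assumption: silently skip lines without '='
        settings[name.strip()] = value.strip()
    return settings


sample = """
# ServerMainHeapSize=600
LoaderMainHeapSize=400
"""
print(parse_brighthouse_ini(sample))
```

In the sample, ServerMainHeapSize is commented out and therefore falls back to its default, while LoaderMainHeapSize takes the value from the file.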
LoaderMainHeapSize=size   Not less than 320. Default: 320.
Size of the memory heap in the loader process, in MB. The sum of the heap sizes
in the server and the loader should not exceed the physical memory installed in the
machine; otherwise performance decreases radically.
Note: The values are commented out (preceded by #) in the brighthouse.ini file which
causes them to default to the application minimum allowed values of 600 and 320 for
ServerMainHeapSize and LoaderMainHeapSize respectively.
The following table shows sample memory configurations for different systems.
In most cases, the loader does not benefit from larger memory settings. However, increasing
the LoaderMainHeapSize can help when:
You can use more memory at import if you are planning to execute several concurrent load
tasks to different data tables. However, disk access may become a bottleneck.
ServerMainHeapSize should be as large as possible but safely smaller than the amount of
physical memory in the machine. If performance decreases because of memory swapping by
the operating system, try to set lower heap sizes. We also recommend decreasing the heap
size if many users are running queries in parallel.
Important: Infobright may use additional memory for heavy loads or queries. Also, other
applications on your server will use memory for their processes. It is important
that the total of ServerMainHeapSize and LoaderMainHeapSize is less than
the total available physical memory. If the system needs to swap memory,
performance will be severely impacted.
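As a rough illustration of the memory rule above, the following Python sketch checks a pair of heap settings against the physical memory of a machine. The function name is hypothetical, and the sketch assumes that ServerMainHeapSize is expressed in MB like LoaderMainHeapSize. It does not account for memory used by other applications or by heavy loads and queries, so leave additional headroom in practice.

```python
# Hypothetical sanity check, not an Infobright tool.
# Assumption: both heap sizes and physical memory are given in MB.
SERVER_MIN_MB = 600   # minimum ServerMainHeapSize stated in this guide
LOADER_MIN_MB = 320   # minimum LoaderMainHeapSize stated in this guide


def check_heap_sizes(server_heap_mb, loader_heap_mb, physical_mb):
    """Return a list of warnings; an empty list means the basic rules hold."""
    warnings = []
    if server_heap_mb < SERVER_MIN_MB:
        warnings.append("ServerMainHeapSize is below the minimum of 600")
    if loader_heap_mb < LOADER_MIN_MB:
        warnings.append("LoaderMainHeapSize is below the minimum of 320")
    if server_heap_mb + loader_heap_mb >= physical_mb:
        warnings.append("total heap size is not less than physical memory")
    return warnings


print(check_heap_sizes(600, 320, 2048))    # settings fit in 2 GB
print(check_heap_sizes(6000, 800, 4096))   # heaps exceed 4 GB of memory
```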
3. Using Infobright
Windows
The Windows Install Wizard automatically creates Infobright as a Windows Service, which
allows the Infobright server to be started and stopped automatically when you boot or
shut down Windows.
• To manually start the Infobright server, from the Windows Start Menu run:
Start/All Programs/Infobright/Infobright Start
• To manually stop the Infobright server, from the Windows Start Menu run:
Start/All Programs/Infobright/Infobright Stop
Linux
You can start and stop the Infobright server the same way you would start and stop the
original MySQL server (mysqld). Before using the Infobright server, see “Starting and
Stopping MySQL Automatically” in the MySQL 5.1 Reference Manual.
Important: It is recommended that you run Infobright using MySQL user credentials rather
than root for security reasons.
• To start/stop the Infobright server during system boot/shutdown, use the mysqld-ib script
in /etc/init.d/ for start and stop services. Use run levels 2, 3, 4, and 5 to start the service, and
run levels 0, 1, and 6 to stop it.
You can also use GUI tools, such as the MySQL Workbench provided by MySQL AB, to query
Infobright databases in a more graphical manner.
You can use the mysql client program to perform the following actions. For more
information, see “Tutorial” in the MySQL 5.1 Reference Manual.
Windows
• To connect to the Infobright command line interface, run:
Start/All Programs/Infobright/Infobright Command Line Client
Linux
• If you used the standard install locations, enter the following command to connect to
Infobright:
/usr/bin/mysql-ib
If you used a different install location, modify the above command to point to your
socket file.
• When the Infobright server is first installed, an administrator account with no password is
created. To connect to the administrator account, use the following command:
mysql-ib
• To run a script when connecting to the administrator account, use the following
command:
mysql-ib < input_script_name.txt
For example:
mysql-ib < /tmp/testing/input.txt
• To run a script when connecting to the administrator account and direct all output to a
text file, use the following command:
mysql-ib < input_script_name.txt > output_file_name.txt
For example:
mysql-ib < /tmp/testing/input.txt > /tmp/testing/output.txt
During the Infobright server shutdown process, the server will not shut down until all
running commands are completed.
Infobright can be used with most Business Intelligence tools and any MySQL GUI client tool
like Toad or Navicat. Simply point to the IP address and socket number for the Infobright
server, and log on using any user credentials that have been set up.
• After connecting to the Infobright administrator account, enter the following command at
the mysql command prompt:
mysql> show variables like "version_comment";
+-----------------+----------------------------------------------------+
| Variable_name   | Value                                              |
+-----------------+----------------------------------------------------+
| version_comment | build number (revision)=IB_4.0.2_r5IB_3.2_GA_5316 |
+-----------------+----------------------------------------------------+
1 row in set (0.00 sec)
The following information is displayed at the command prompt. In this example, BRIGHTHOUSE
is shown as the default storage engine. You can combine the usage of different storage
engines but you should avoid joining across storage engines as this can result in sub-optimal
performance due to the use of the MySQL query engine. However it can be quite useful in
some cases to store query results in Memory or MyISAM tables and do further manipulations
of results.
mysql> show engines;
+-----------+---------+----------------------------------------------------------+-------------+---+-----------+
|Engine |Support |Comment |Transactions |XA |Savepoints |
+-----------+---------+----------------------------------------------------------+-------------+---+-----------+
|BRIGHTHOUSE|DEFAULT |Infobright storage engine |YES |NO | NO |
|MRG_MYISAM |YES |Collection of identical MyISAM tables |NO |NO | NO |
|CSV |YES |CSV storage engine |NO |NO | NO |
|MEMORY |YES |Hash based, stored in memory, useful for temporary tables |NO |NO | NO |
|MyISAM |YES |Default engine as of MySQL 3.23 with great performance |NO |NO | NO |
+-----------+---------+----------------------------------------------------------+-------------+---+-----------+
5 rows in set (0.00 sec)
log-error=<filename>
log-output=FILE
Infobright log Server start and stop information. Also contains missing
configuration settings.
Note: In general, more detail in the log may have an impact on performance, so it is
recommended that users find and use the setting that strikes the best balance for them in
terms of performance versus log details.
About Errors
Infobright reports the same errors as the standard MySQL server. For more information,
see “Appendix B. Errors, Error Codes, and Common Problems” in the MySQL 5.1 Reference
Manual.
There are a few additional errors specific to Infobright import and export commands. For
more information, see “About Import Errors” on page 44 and “About Export Errors” on
page 44.
There are special considerations when using the following commands with Infobright. All
other SQL commands can be used with Infobright as they are with the standard MySQL.
For example, the SQL standard does not define a default collation for string comparisons,
which affects the ordering of query results. Different databases will implement different
collation approaches, thus displaying inconsistent results for such things as sorts.
Within the data subdirectory, Infobright databases are stored in separate subdirectories.
Within each database subdirectory, data files for each Infobright table are stored in separate
subdirectories.
Important: Do not manually copy a data table from one database to another by copying
the database files—internal table numbering errors and Knowledge Grid
inconsistencies may occur. To copy a table, use import and export commands
(see Chapter 7, “Importing and Exporting Data in Infobright” on page 51) or
back up the entire database directory (see Chapter 9, “Infobright Backup and
Recovery” on page 72).
The Infobright server uses additional directories to store temporary data, and optimization
information, such as Knowledge Nodes. The following shows the data directory, containing
the Infobright databases:
[root@ib03 data]# pwd
/usr/local/infobright/data
[root@ib03 data]# ls
BH_RSI_Repository
Infobright.log
Infobright.seq
ib03.corp.infobright.com.err
mysql
test
Numeric Types
Data Type Minimum Maximum
TINYINT -127 127
0 <= D <= M
String Types
Data Type Maximum Length
CHAR(N) 255
VARCHAR(N) 65532
BINARY(N) 255
VARBINARY(N) 65532
TINYTEXT 255
TEXT(N) 65535
Important: Do not manually copy a data table from one database to another by copying
the database files—internal table numbering errors and Knowledge Grid
inconsistencies may occur. To copy a table from one database to another, export
from the source database and then import into the target database (see Chapter
7, “Importing and Exporting Data in Infobright” on page 38) or back up the
entire database directory (see Chapter 9, “Infobright Backup and Recovery”
on page 54). You can rename the entire database by renaming the folder.
However, you should not copy a database folder from one active instance to
another, or within the same active instance.
See “About Column Options” on page 28 for information on supported and unsupported
options when creating columns.
Note: When creating a table, as a matter of practice one should always use the ENGINE=
option to ensure that the correct database engine is used. Infobright is shipped with
DEFAULT ENGINE = BRIGHTHOUSE, but this can be changed. The name of the engine
can be specified explicitly at the end of create table statement:
• NOT NULL replaces the imported NULL values with default values such as 0 (zero) for
numeric columns and an empty string (‘’) for string columns.
Lookup Columns
Infobright provides an additional modifier for string data type columns, called a lookup
column. The lookup column utilizes an integer substitution for values. You can declare a
lookup column on a CHAR or VARCHAR column to increase its compression and performance
in queries. However, to use a lookup column, the CHAR or VARCHAR column must meet the
following criteria:
• There is no fixed upper limit for unique values in the column (cardinality). The total size
of a dictionary, being the total length of all distinct values, will be loaded into RAM (for
example: 1 million distinct values that are each 100 characters wide will permanently
occupy 100 MB of RAM.) As a rough guideline, the ratio of total number of records to
distinct values should be reasonably high (greater than 10).
• The column must contain a large number of duplicate values: the ratio of total number of
records to distinct values should be greater than 10.
Typically, a lookup column is useful for fields like state, gender, category, and the like
where the number of instances is very high, but the number of unique values is very low.
To determine the ratio of records to distinct values, determine the number of distinct values
using SELECT COUNT(DISTINCT <COLUMN>) FROM… Then compare this to the number of
records using SELECT COUNT(<COLUMN>) FROM…
Note: Using a lookup on a column where there are more than 10,000 distinct values will
result in greatly reduced load speeds.
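Once you have the two counts, the guidelines above can be applied mechanically. The following Python sketch is a hypothetical helper (not an Infobright utility) that takes the record count and the distinct-value count as plain numbers, computes the ratio, and reports which of the two guidelines from this section are not met: a ratio greater than 10, and no more than 10,000 distinct values.

```python
# Hypothetical helper: applies the lookup-column guidelines from this guide
# to counts obtained separately (for example, from the two COUNT queries).
MIN_RATIO = 10            # records per distinct value should exceed this
MAX_DISTINCT = 10_000     # above this, load speed is greatly reduced


def lookup_column_advice(total_rows, distinct_values):
    """Return (ratio, notes); an empty notes list means both guidelines hold."""
    if distinct_values <= 0:
        raise ValueError("distinct_values must be positive")
    ratio = total_rows / distinct_values
    notes = []
    if ratio <= MIN_RATIO:
        notes.append("ratio of records to distinct values is not greater than 10")
    if distinct_values > MAX_DISTINCT:
        notes.append("more than 10,000 distinct values: expect slower loads")
    return ratio, notes


print(lookup_column_advice(1_000_000, 50))      # e.g. a state or category column
print(lookup_column_advice(1_000_000, 20_000))  # too many distinct values
```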
• To declare a column as a lookup column, add the comment ‘lookup’ on the column. Enter
the following command:
mysql> create table …
(…
<<column name>> <<column type>> … comment 'lookup' …
…)
engine=brighthouse;
• default values
• keys
• indices
• unique columns
• auto-increment columns
For more information, see “SHOW COLUMNS Syntax” in the MySQL 5.1 Reference Manual.
Utilization of the FULL option will provide an estimate of the compression for each column.
• To view the CREATE TABLE statement used to create a given table, enter the following
command:
SHOW CREATE TABLE tbl_name;
For more information, see “SHOW CREATE TABLE Syntax” in the MySQL 5.1 Reference
Manual.
mysql> show create table dim_cars;
+----------+--------------------------------------------------------------------+
| Table | Create Table |
+----------+--------------------------------------------------------------------+
| dim_cars | CREATE TABLE `dim_cars` (
`make_id` decimal(10,0) DEFAULT NULL,
`make_name` varchar(25) DEFAULT NULL,
`model_name` varchar(25) DEFAULT NULL,
`record_dt` datetime DEFAULT NULL
) ENGINE=BRIGHTHOUSE DEFAULT CHARSET=latin1 |
+----------+--------------------------------------------------------------------+
1 row in set (0.00 sec)
For more information, see “SHOW TABLE STATUS Syntax” in the MySQL 5.1 Reference
Manual.
---------------------+---------------------+------------+-------------------+----------+----------------+----------------------------------+
 Create_time         | Update_time         | Check_time | Collation         | Checksum | Create_options | Comment                          |
---------------------+---------------------+------------+-------------------+----------+----------------+----------------------------------+
The following natural sizes (in bytes) are defined for various data types. Note the following:
• For all data types, if the column is not declared as NOT NULL, add one bit per value for NULL
indicators.
• These data sizes take into account the typical format of data display, for example
“yyyy-mm-dd” for DATE or decimal point for DEC. The size also counts the bytes that store the
actual text length (VARCHAR).
VARCHAR(n), VARBINARY(n): (total number of bytes used, i.e., the total length of all
strings, excluding terminating characters) + 2*(number of rows)
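As a worked illustration of the VARCHAR(n) rule and the NULL-indicator note above, the following Python sketch computes the natural size of a small column. The function is hypothetical, and two details are assumptions rather than statements from this guide: strings are measured as UTF-8 bytes, and the one bit per value for NULL indicators is expressed as a fraction of a byte.

```python
# Hypothetical calculation of the "natural size" of a VARCHAR column.
# Assumptions: lengths are measured in UTF-8 bytes; NULL indicator bits
# are converted to bytes by dividing by 8.

def varchar_natural_size(values, not_null=False):
    """Return the natural size in bytes for a list of string values."""
    text_bytes = sum(len(v.encode("utf-8")) for v in values)
    length_bytes = 2 * len(values)          # 2 bytes per row for the length
    size = text_bytes + length_bytes
    if not not_null:
        size += len(values) / 8             # one bit per value for NULLs
    return size


cities = ["Toronto", "Ottawa", "Hull"]
print(varchar_natural_size(cities, not_null=True))   # 17 text bytes + 6
print(varchar_natural_size(cities))                  # plus 3 bits for NULLs
```

For the three sample values, the text occupies 7 + 6 + 4 = 17 bytes and the length prefixes add 2 × 3 = 6 bytes, giving 23 bytes for a NOT NULL column and 23.375 bytes otherwise.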
The optional like clause can be used to filter the tables. Note that the table name must be
provided in single quotes.
The compression statistics are provided in the table comment. For example:
mysql> show table status from test like 't1' \G
*********************** 1. Row **********************
Name: t1
Engine: BRIGHTHOUSE
Version: 10
Row_format: Compressed
Rows: 3430387
Avg_row_length: 0
Data_length: 0
Max_data_length: 0
Index_length: 0
Data_free: 0
Auto_increment: NULL
Create_time: 2008-09-04 15:31:39
Check_time: NULL
Update_time: 2008-09-04 15:35:30
Collation: ascii_bin
Checksum: NULL
Create_options:
Comment: Overall compression ratio 39.908
1 row in set (0.59 sec)
A database name and a column filter can be specified in optional clauses. For more
information, see “SHOW COLUMNS Syntax” in the MySQL 5.1 Reference Manual.
The compression statistics are provided in the column comment. In addition to the
compression information, the comment line may also contain a “unique” indicator, meaning
that the column has all unique values (except nulls).
For example:
When using GUI tools with Infobright, such as MySQL Browser, use them in read-only
mode. Do not use these tools to insert, update, or delete data; doing so may cause errors
and hang the GUI application.
To insert data into Infobright tables, use the MySQL import command. For more information,
see Chapter 7, “Importing and Exporting Data in Infobright” on page 38.
Important: Queries that evaluate against UTF-8 character data columns run more slowly
than an equivalent query against ASCII character data, because the Character
Maps in the Knowledge Grid support ASCII data (see Chapter 8,
“Running Queries in Infobright” on page 66). UTF-8 specific Knowledge Grid
extensions will be available in an upcoming release.
• utf8_bin • utf8_polish_ci
• utf8_czech_ci • utf8_roman_ci
• utf8_danish_ci • utf8_romanian_ci
• utf8_esperanto_ci • utf8_slovak_ci
• utf8_estonian_ci • utf8_slovenian_ci
• utf8_hungarian_ci • utf8_spanish_ci
• utf8_icelandic_ci • utf8_swedish_ci
• utf8_latvian_ci • utf8_turkish_ci
• utf8_lithuanian_ci • utf8_unicode_ci*
• utf8_persian_ci
*utf8_unicode_ci properly handles both French and German collation, so specific collation
types for these languages are not necessary.
For more information, see “Unicode Support” in the MySQL 5.1 Reference Manual.
The SQL standard does not define a default collation; therefore, many DBMS engines have
different default collations and produce different results. As a result, there are several
differences between Infobright and other DBMS engines.
• For Infobright, character data types are case-sensitive. For example, the condition
‘toronto’=‘Toronto’ is not true in Infobright. Similarly, the condition LIKE ‘Abc%’ is not
true for ‘abcde’.
• The Infobright sorting order is “A…Z a…z” (for example ‘Zeta’ < ‘alfa’), which is the
same sorting order as used by Oracle. The Infobright sorting order is different than the
default MySQL sorting order, which mixes lowercase and uppercase; the SQL Server
order, which is “aAbB…zZ”; and the DB2 order, which is “AaBb…Zz”.
• The Infobright sorting order affects ORDER BY results, GROUP BY results (which is the
order of groups and their definitions—for example, ‘aaa’ and ‘AAA’ define different
groups) and DISTINCT results. WHERE conditions may also be affected if you are expecting a
different sorting order than the one used by Infobright.
• To simulate Infobright collation in the MySQL engine, set latin1_bin collation while
creating a table (for more information, see “Table Character Set and Collation” in the
MySQL 5.1 Reference Manual). Enter the following command:
mysql> create table … collate ascii_bin;
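The ordering and case-sensitivity rules above can be illustrated outside the database. The following Python sketch is only an analogy: Python compares strings by code point, which for ASCII text gives the same “A…Z a…z” order as a binary collation. The sample values are hypothetical.

```python
names = ["alfa", "Zeta", "Beta", "abc"]

# Code-point comparison: every uppercase letter sorts before any lowercase one.
binary_order = sorted(names)

# Case-insensitive ordering, similar in spirit to a *_ci collation.
mixed_order = sorted(names, key=str.lower)

print(binary_order)                 # ['Beta', 'Zeta', 'abc', 'alfa']
print(mixed_order)                  # ['abc', 'alfa', 'Beta', 'Zeta']
print("toronto" == "Toronto")       # False: comparison is case-sensitive
print("abcde".startswith("Abc"))    # False: like LIKE 'Abc%' on 'abcde'
print(len({"aaa", "AAA"}))          # 2: two distinct groups
```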
Padding
Infobright treats padding differently than other DBMS engines. Infobright assumes literal
comparisons of text fields, including all whitespace characters. Therefore, a string containing
two spaces is different than a string containing one space or an empty (0 length) string, which
is also different than the NULL value.
The Infobright padding definition is compatible with the SQL standard. However, most
DBMS systems have defined less restricted, customizable rules regarding text comparison.
For example, ‘abc ’ = ‘abc’ may be true in some databases but is not true in Infobright.
Note: In CHAR columns, trailing spaces are trimmed on LOAD, whereas in VARCHAR
columns values are loaded with all spaces.
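The padding behavior described above can be mimicked with plain string operations. This Python sketch is an illustration only; the helper names load_char and load_varchar are hypothetical and simply model the trimming rule from the note.

```python
def load_char(value):
    """Model a CHAR column: trailing spaces are trimmed on LOAD."""
    return value.rstrip(" ")


def load_varchar(value):
    """Model a VARCHAR column: the value is loaded with all spaces."""
    return value


print("abc " == "abc")                 # False: comparison is literal
print("  " == " ")                     # False: two spaces differ from one
print(load_char("abc ") == "abc")      # True: CHAR lost its trailing space
print(load_varchar("abc ") == "abc")   # False: VARCHAR kept it
```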
About Transactions
• To enable the use of COMMIT and ROLLBACK commands in Infobright, you must disable
AUTOCOMMIT. Enter the following command:
mysql> set autocommit=0;
You can disable AUTOCOMMIT by setting the parameter to 0 (zero) and enable AUTOCOMMIT
by setting the parameter to 1. If AUTOCOMMIT is set to 1, then when a LOAD is completed, the
transaction is automatically committed.
• If you have not yet committed a LOAD DATA INFILE transaction, you can roll back the
transaction. This will restore the import tables to the state that existed before the current
transaction. Enter the following command:
mysql> rollback;
Using COMMIT and ROLLBACK makes it possible to check the load within the same session
before committing the data, as the loaded data is available (viewable) to the load session.
For instance, you could check something about the data (such as the number of records
loaded) before committing.
After importing data using the LOAD DATA INFILE command, the status of the import and
the number of affected rows is shown. All uncommitted rows, including those from previous
imports, are shown; therefore, the number of affected rows may be greater than the number
of rows in the file you just imported.
• Queries to the table are not executed until the current import is complete and the
operation is committed.
• Until the current write operation is committed, all subsequent write commands to the table
are queued. They will wait for the write lock to be released before proceeding in the order
they were received.
In general, Infobright uses table-level locking: only one LOAD operation can execute at a
time, and only after running queries have completed.
Failure Handling
If AUTOCOMMIT is disabled and the Infobright server is terminated during an import session,
the following occurs:
• Infobright does not store the rows that were loaded during the failed import operation.
• The input file and the database files are not harmed. To load data from the input file,
repeat the LOAD operation.
If AUTOCOMMIT is disabled and the Infobright server is terminated after an import session is
completed successfully but is not committed, the following occurs:
• The transaction is rolled back and the imported data is lost when the server restarts.
• The input file and the database files are not harmed by the failed import operation (the
database is unaffected, as if the import session did not occur). To re-import the data,
repeat the LOAD operation.
If the Infobright server is terminated during an export operation to a disk file, the following
occurs:
• A non-empty file is saved on disk; however, the last row in the saved file is inconsistent.
• The database files are not harmed by the failed export operation. To export the data,
repeat the export operation.
If Infobright tries to import data from a file created during a failed export session, the
following occurs:
• No data is inserted because the input file consists of corrupted table rows. No new records
are added to the database files, so no harm is done.
Escape Characters
The Infobright Loader supports escape character definition and usage.
Other DBMS systems may have different representations of the NULL value; for example,
MySQL only recognizes the representation \N for a NULL value. This can create issues if you
export data from Infobright and import the data into MySQL. Since MySQL will only look for
\N and will not recognize the Infobright representation of the NULL value, MySQL will change
the NULL value into the default values in numeric and string columns.
Importing Data
• To import data into an Infobright table, use the following MySQL loading command:
LOAD DATA INFILE 'file_name' INTO TABLE tbl_name
[FIELDS
[TERMINATED BY 'char']
[ENCLOSED BY 'char']
[ESCAPED BY 'char']
];
where:
For more information, see “LOAD DATA INFILE Syntax” in the MySQL 5.1 Reference Manual.
A few important notes:
• Before importing data using the LOAD DATA LOCAL syntax, be sure you fully
understand the security issues. See “Security Issues with LOAD DATA LOCAL” in the
MySQL 5.1 Reference Manual for details.
• You can disable all LOAD DATA LOCAL statements from the server side by starting mysqld
with the --local-infile=0 option.
• For the mysql command-line client, enable LOAD DATA LOCAL by specifying the
--local-infile[=1] option, or disable it with the --local-infile=0 option. For mysqlimport, local
data file loading is off by default; enable it with the --local or -L option. In any case,
successful use of a local load operation requires that the server permits it.
• Some (but not all) Windows GUI tools may work with remote load, even with Linux
servers.
To import data into an Infobright table from a remote machine across the network, use the
following MySQL loading command (for more information about command options, see the
Data Loading Guide):
LOAD DATA [LOCAL] INFILE 'file_name' INTO TABLE tbl_name
[FIELDS
[TERMINATED BY 'char']
[ENCLOSED BY 'char']
[ESCAPED BY 'char']
];
where:
If LOCAL is specified, the file is read by the client program on the client host and sent to
the server. The file can be given as a full path name to specify its exact location. If given as
a relative path name, the name is interpreted relative to the directory in which the client
program was started.
Note: Network speeds may limit the load speed. Exceptions and errors in the transfer are
handled by the MySQL client and behave the same as they do in the MySQL client.
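When load statements are generated by a script, the syntax above can be assembled programmatically. The following Python sketch is a minimal illustration; the helper names quote_sql and build_load_statement, the file path and the table name are all hypothetical, and the table name is inserted as given without any validation.

```python
def quote_sql(text):
    """Wrap text in single quotes, escaping backslashes and single quotes."""
    return "'" + text.replace("\\", "\\\\").replace("'", "\\'") + "'"


def build_load_statement(file_name, tbl_name, terminated=None,
                         enclosed=None, escaped=None, local=False):
    """Assemble a LOAD DATA [LOCAL] INFILE statement as a string."""
    parts = ["LOAD DATA"]
    if local:
        parts.append("LOCAL")
    parts += ["INFILE", quote_sql(file_name), "INTO TABLE", tbl_name]
    fields = []
    if terminated is not None:
        fields.append("TERMINATED BY " + quote_sql(terminated))
    if enclosed is not None:
        fields.append("ENCLOSED BY " + quote_sql(enclosed))
    if escaped is not None:
        fields.append("ESCAPED BY " + quote_sql(escaped))
    if fields:
        parts.append("FIELDS " + " ".join(fields))
    return " ".join(parts) + ";"


statement = build_load_statement("/tmp/customers.txt", "customers",
                                 terminated=";", enclosed='"', local=True)
print(statement)
```

The printed statement can then be passed to the mysql client or to any MySQL connector.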
Exporting Data
• To export data from an Infobright table, use the following MySQL export command:
SELECT … INTO OUTFILE 'file_name'
[FIELDS
[TERMINATED BY 'string']
[ENCLOSED BY 'char']
[ESCAPED BY 'char']]
FROM tbl_name;
where:
tbl_name = name of the table from which the data will be retrieved
For more information on export syntax, see “SELECT Syntax” in the MySQL 5.1 Reference
Manual.
You can use the optional FIELDS clause to specify how values are provided in the input file.
To use the FIELDS clause, the following must be true:
Within the FIELDS clause, you can use the following sub clauses:
• Use the TERMINATED BY sub clause to specify the character recognized as the separator
(delimiter) between values. By default, a semicolon ';' is assumed to separate values.
• Use the ENCLOSED BY sub clause to specify the character that begins and ends each string
representing a text value. By default, a double quotation mark '"' is assumed to enclose
each value. If the text values in the input file do not use any enclosing characters, use the
value 'NULL' in the ENCLOSED BY sub clause. Note that this is the same as using the empty
string '' option in standard MySQL.
• Use the ESCAPED BY sub clause to support special characters that may be embedded within
text fields.
• If a NULL value is imported into a column defined as NOT NULL (except for TIMESTAMP
columns), it is replaced by 0 (for numerical, date and time columns) or by an empty string
(for string columns).
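To see how the default separator and enclosing character described above split a line of input, the following Python sketch parses sample data with the standard csv module. The csv module is only a stand-in for the Infobright loader, and the sample rows and the helper name parse_lines are hypothetical.

```python
import csv
import io


def parse_lines(text, terminated=";", enclosed='"'):
    """Split delimited text into rows of values."""
    reader = csv.reader(io.StringIO(text), delimiter=terminated,
                        quotechar=enclosed)
    return list(reader)


sample = '1;"Toronto";"Smith; John"\n2;"Warsaw";"Nowak"\n'
rows = parse_lines(sample)
print(rows)
# [['1', 'Toronto', 'Smith; John'], ['2', 'Warsaw', 'Nowak']]
```

Note that the semicolon inside the enclosed value "Smith; John" is kept as data rather than treated as a separator.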
To set up a Linux pipe, run the mkfifo command from Linux and ensure that the
pipe is accessible to Infobright. In the following example the pipe is set up as
/pipe_test/thepipe.pipe. You can use the directory and name of your choice.
mkfifo /pipe_test/thepipe.pipe
chmod 666 /pipe_test/thepipe.pipe
Once the pipe is set up, send data to it from either a file or a process:
cat /usr/tmp/jkvarload.txt > /pipe_test/thepipe.pipe &
2 | Wrong data or column definition | Format of data does not comply with table definition | Ensure the data being imported is the correct data type and does not exceed the size specified
6 | Wrong parameter | Wrong value for one of the loading parameters | Make sure the correct parameter is used (see “Setting Import and Export Parameters” on page 57)
7 | Data conversion error | A value in data cannot be converted to a column type | Ensure the data is the correct column type

6 | Wrong parameter | Wrong value for one of the export parameters | Make sure the correct parameter is used (see “Setting Import and Export Parameters” on page 57)
7 | Data conversion error | Not used | Ensure the data is the correct column type
The following sample script creates a table called customers, sets Infobright as the default
engine, imports data from an existing text file and exports the data.
USE Northwind;
PostalCode char(10),
Country char(15),
Phone char(24),
Fax varchar(24),
CreditCard float(17,1),
FederalTaxes decimal(4,2)
) ENGINE=BRIGHTHOUSE;
Smallint
Mediumint
Int
BigInt
Double
Smallint
Mediumint
Int
BigInt
Float
Double
Decimal(N, M)
Smallint
Mediumint
Int
BigInt
Float
Double
Decimal(N, M)
DPN (Data Pack Nodes): Statistical metadata that describes the content of the Data
Pack. Used to assist both in data access and in rough
operations.
Running Queries
• To run queries on Infobright tables, use the following standard MySQL syntax:
mysql> select …;
The Infobright Optimizer is the primary engine used to resolve queries. While significant
additions have been made to the library of supported SQL, there are cases where a query
is still executed by the MySQL query engine instead of the Infobright engine. In that
case, query response time tends to suffer because the MySQL engine is row-oriented and
cannot use the Knowledge Grid information; in some cases the query can be too slow to be
usable. For best performance, ensure your queries (and VIEWs) contain only syntax
supported by the Infobright Optimizer. For the select syntax supported in Infobright, see
“Infobright Optimizer – Supported Functions and Operators” on page 73.
If the MySQL query path is disabled, then the following message will be returned if the query
would have otherwise been directed to MySQL for processing:
The query includes syntax that is not supported by the Infobright Optimizer.
Infobright suggests either restructure the query with supported syntax, or
enable the MySQL Query Path in the brighthouse.ini file to execute the query
with reduced performance.
This will occur when functions not optimized in Infobright are used. If you get poor query
performance, you should execute the command below to identify if a query has been directed
to the MySQL query engine.
• After running a query, enter the following command to view any warnings:
mysql> show warnings;
The following message indicates that the query was directed to MySQL for processing:
1105 | Query syntax not implemented in Brighthouse, executed by MySQL engine.
Important: When queries are executed on Infobright tables by the standard MySQL engine,
performance can be significantly slower than when queries are executed by
Infobright.
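A script that runs queries can perform the same check automatically by inspecting the rows returned by show warnings. The following Python sketch assumes each row arrives as a (Level, Code, Message) tuple, which is the column layout of SHOW WARNINGS in MySQL; the function name and the sample row (including its Level value) are hypothetical.

```python
FALLBACK_CODE = 1105
FALLBACK_TEXT = "executed by MySQL engine"


def ran_on_mysql_engine(warnings):
    """Return True if any warning row reports the MySQL engine fallback."""
    return any(code == FALLBACK_CODE and FALLBACK_TEXT in message
               for _level, code, message in warnings)


rows = [("Warning", 1105,
         "Query syntax not implemented in Brighthouse, executed by MySQL engine.")]
print(ran_on_mysql_engine(rows))   # True
print(ran_on_mysql_engine([]))     # False
```

The message text is checked as well as the code, on the assumption that code 1105 may also be used for other messages.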
Terminating a Query
If you want to terminate a query executed from a client session before the query is complete,
do the following:
1. Use the show [full] processlist command to determine the query’s process ID.
OR
If you are using a command-line MySQL client, you can also use Ctrl+C to terminate the
query.
A VIEW must contain unique column names. If you select two columns with the same name
from separate tables, at least one must be aliased or the column list option must be used.
If the View’s select statement contains functionality that is not supported in the Infobright
optimizer, then the VIEW will perform sub-optimally since it will always flip over to the
MySQL query engine.
Select Syntax
For more information, see “SELECT Syntax” in the MySQL 5.1 Reference Manual.
SELECT [ ALL | DISTINCT | DISTINCTROW ]
Select_expr, …
[ FROM table_references
[ WHERE where_condition ]
[ GROUP BY {col_name | expr | position} ]
[ HAVING where_condition ]
[ ORDER BY {col_name | expr | position } [ ASC | DESC ], … ]
[ LIMIT { [ offset,] row_count | row_count OFFSET offset} ]
[ INTO OUTFILE 'file_name' export_options
- AS alias_name
- ORDER BY NULL ]
Join Syntax
For more information, see “JOIN Syntax” in the MySQL 5.1 Reference Manual.
Infobright supports the following JOIN syntax for the table_references part of SELECT
statements (as described in the previous section, “Select Syntax”):
table_factor:
tbl_name [ [ AS ] alias]
join_table:
table_reference [ INNER | CROSS ] JOIN table_factor [join_condition]
| table_reference STRAIGHT_JOIN table_factor
| table_reference STRAIGHT_JOIN table_factor ON condition
| table_reference {LEFT|RIGHT} [OUTER] JOIN table_reference join_condition
join_condition:
ON conditional_expr | USING (column_list)
Union Syntax
For more information, see “UNION Syntax” in the MySQL 5.1 Reference Manual.
SELECT ….
UNION [ ALL | DISTINCT ] SELECT …
[ UNION [ ALL | DISTINCT ] SELECT … ]
Subqueries
For more information, see “Subquery Syntax” in the MySQL 5.1 Reference Manual.
• correlated subqueries
Query Performance
Due to Infobright’s column-oriented data organization and other Infobright-specific features,
query optimization in Infobright is slightly different than in traditional DBMS approaches.
• Infobright works well with data tables containing many columns, where only necessary
columns are accessed by query (as opposed to SELECT *). The traditional approach
suggests keeping records as small as possible (e.g., using schema normalization and table
decomposition). However, in Infobright, only necessary columns are used in calculations.
Therefore, queries with many limiting conditions on many columns of the same table are
especially well optimized in Infobright.
• Avoid using OR in queries and, if possible, use IN instead. In some cases ORs can be
translated to UNION ALL or IN, for example: “...WHERE a=1 OR a=2...” could be replaced
by “...WHERE a IN (1,2)...”.
• Executing queries in steps may also help with missing function support. For instance,
execute the bulk of the query in Infobright and export the data to a MyISAM table. Then
To optimize your query performance, avoid the following, which will result in the query
being handled by the MySQL query engine:
Backup Procedure
Use the following procedures to back up Infobright.
• To back up the Infobright databases, copy the entire directory containing the Infobright
databases (usually the data subdirectory in your Infobright installation directory).
• You can take advantage of incremental backups, since only some of the database files are
updated when new data is imported. Be sure to do a full backup occasionally.
Important: Some files in the KNFolder are updated when queries (using JOIN) are run so be
sure to back up the KNFolder on a regular basis.
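The full-backup step above amounts to a recursive directory copy. The following Python sketch shows only that copy step, demonstrated on a throwaway directory tree that stands in for a real data directory; the function name and file names are hypothetical, and the target directory must not exist before the copy.

```python
import shutil
import tempfile
from pathlib import Path


def full_backup(data_dir, backup_dir):
    """Copy the whole data directory and list the files that were copied."""
    shutil.copytree(data_dir, backup_dir)
    backup = Path(backup_dir)
    return sorted(p.relative_to(backup).as_posix()
                  for p in backup.rglob("*") if p.is_file())


# Demonstration on a temporary tree (a stand-in for the Infobright data directory).
with tempfile.TemporaryDirectory() as tmp:
    data = Path(tmp) / "data"
    (data / "test").mkdir(parents=True)
    (data / "test" / "t1.frm").write_text("stand-in file")
    copied = full_backup(data, Path(tmp) / "backup")

print(copied)  # ['test/t1.frm']
```

If the KNFolder is located outside the data directory, it would need the same treatment in a second call.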
Restore Procedure
To restore the Infobright databases from a backup copy, do the following:
1. Replace the entire data directory (usually the data subdirectory in your Infobright
installation directory) with the backup copy.
2. Replace the KNFolder with the backup copy (if the KNFolder is not inside the data
directory).
Important: Do not manually modify database files or move them from one database to
another—this may lead to data corruption and unpredictable results.
Equal = YES
IS No (MySQL engine)
IS NULL YES
COALESCE YES
IN YES
NOT IN YES
ISNULL YES
Logical Operators
OR, || YES
CASE YES
IF YES
IFNULL YES
NULLIF YES
String Functions
ASCII YES
BIN YES
BIT_LENGTH YES
CHAR_LENGTH YES
CHARACTER_LENGTH YES
CONCAT YES
CONCAT_WS YES
CONV YES
ELT YES
EXPORT_SET YES
FIELD YES
FIND_IN_SET YES
FORMAT YES
HEX YES
INSTR YES
LCASE YES
LEFT YES
LENGTH YES
LOCATE YES
LOWER YES
LPAD YES
LTRIM YES
MAKE_SET YES
MID YES
OCT YES
OCTET_LENGTH YES
ORD YES
POSITION YES
QUOTE YES
REPEAT YES
REPLACE YES
REVERSE YES
RIGHT YES
RPAD YES
RTRIM YES
SOUNDEX YES
SPACE YES
SUBSTR YES
SUBSTRING YES
SUBSTRING_INDEX YES
TRIM YES
UCASE YES
UPPER YES
LIKE YES
RLIKE YES
REGEXP YES
STRCMP YES
Numeric Functions
Addition ( + ) YES
Subtraction ( - ) YES
Multiplication ( * ) YES
Division ( / ) YES
Modulo ( % ) YES
ABS YES
ACOS YES
ASIN YES
ATAN YES
CEIL YES
CEILING YES
CONV YES
COS YES
COT YES
DEGREES YES
EXP YES
FLOOR YES
LN YES
LOG10 YES
LOG2 YES
LOG YES
MOD YES
OCT YES
PI YES
POW YES
POWER YES
RADIANS YES
RAND YES
ROUND YES
SIGN YES
SIN YES
SQRT YES
TAN YES
TRUNCATE YES
ADDDATE YES
ADDTIME YES
CURDATE YES
CURRENT_DATE YES
CURRENT_TIME YES
CURRENT_TIMESTAMP YES
CURTIME YES
DATE YES
DATEDIFF YES
DATE_ADD YES
DATE_FORMAT YES
DATE_SUB YES
DAY YES
DAYNAME YES
DAYOFMONTH YES
DAYOFWEEK YES
DAYOFYEAR YES
EXTRACT YES
FROM_UNIXTIME YES
HOUR YES
LOCALTIME YES
LOCALTIMESTAMP YES
MINUTE YES
MONTH YES
MONTHNAME YES
NOW YES
PERIOD_ADD YES
PERIOD_DIFF YES
QUARTER YES
SECOND YES
SUBDATE YES
SUBTIME YES
SYSDATE YES
TIME YES
TIMEDIFF YES
TIME_FORMAT YES
TO_DAYS YES
UNIX_TIMESTAMP YES
UTC_DATE YES
UTC_TIME YES
WEEK YES
YEARWEEK YES
CAST YES
CONVERT YES
AVG YES
COUNT YES
MIN YES
MAX YES
STDDEV_POP YES
STDDEV_SAMP YES
SUM YES
VAR_POP YES
VAR_SAMP YES
VARIANCE YES
Group By Modifiers
Setting ControlMessages=4 in the brighthouse.ini file prints the configuration settings to
the log file.
Infobright includes a standalone application to adapt existing tables created prior to ICE 3.3.1
GA to UTF-8 capable structures. The Charset Migration Tool (CHMT) is in the Infobright bin
directory.
Executing CHMT:
chmt --datadir=/absolute/path/to/data/directory [other parameters]
conv-map | Optional | Absolute path to file with collations conversions | If not specified, CHMT tries to use the file chmt-binary-folder/../support-files/collations.txt; if it is not found there, CHMT searches for chmt-binary-folder/collations.txt
Log Structure
The logs detail information about every considered table found in a specified datadir. Each
conversion finishes with [NOT NEEDED], [PASS] or [FAILED] status.
Collations-conversion-file Structure
Each conversion directive is stored on one line of the file:
collation_from_name;collation_from_id;collation_to_name;collation_to_id
For example:
big5_chinese_ci;1;binary;63
where both fields containing names are only informative (all conversions will be done using
only ids).
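To make the directive format concrete, the following Python sketch parses directive lines and builds a mapping from source collation id to target collation id, ignoring the informative name fields. The function names are hypothetical and this is not CHMT code.

```python
def parse_directive(line):
    """Split one directive into (from_name, from_id, to_name, to_id)."""
    from_name, from_id, to_name, to_id = line.strip().split(";")
    return from_name, int(from_id), to_name, int(to_id)


def conversion_map(lines):
    """Map each source collation id to its target id (names are ignored)."""
    result = {}
    for line in lines:
        if line.strip():
            _from_name, from_id, _to_name, to_id = parse_directive(line)
            result[from_id] = to_id
    return result


print(parse_directive("big5_chinese_ci;1;binary;63"))
# ('big5_chinese_ci', 1, 'binary', 63)
print(conversion_map(["big5_chinese_ci;1;binary;63"]))   # {1: 63}
```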
Infobright DomainExpert
The Infobright DomainExpert improves data compression and the performance of import,
queries and export. The DomainExpert allows you to define the composition of data,
particularly columns. The database then uses this information to optimize the storage of the
data and to reduce query processing time.
Decomposition Rules
Decomposition rules are the main DomainExpert objects. Each rule describes the composition
structure of values of a selected column expressed in a simple language. You can create,
modify and delete rules using the following stored procedures from the system database
sys_infobright:
CREATE_RULE(id, rule, comment)
UPDATE_RULE(id, rule)
CHANGE_RULE_COMMENT(id, comment)
DELETE_RULE(id)
where:
For example, to create a simple rule for email addresses, you would run the following
command:
CALL SYS_INFOBRIGHT.CREATE_RULE('EMAIL', '%s@%s', 'Rule for email addresses');
The rules are stored in the system table DECOMPOSITION_DICTIONARY. The list of all rules
defined in the system can be obtained with the following query:
SELECT * FROM SYS_INFOBRIGHT.DECOMPOSITION_DICTIONARY;
Examples:
• %s@%s decomposes an email address into the user name and the domain name
• %s://%s?%s decomposes a simple url with a query string into the scheme, the address and
the query string
As the percent sign (%) is a special character, to match it literally you can use a double percent
sign (%%). For example, to match exactly the text 10% humidity, the rule can be defined as
10%% humidity. However, the percent sign only has a special meaning if it is followed by the
letter s or d. Otherwise the percent sign is taken literally, so in the above example the
unmodified text 10% humidity is also valid syntax for the same rule.
There are two constraints on the rule syntax—the following ambiguous subsequences of
symbols are not allowed in rules:
• %s%s
• %d%d
The matching algorithm for rules is LAZY—the algorithm moves to the next primitive in the
rule as soon as possible. For example, for the text aa.bb.cc and the rule %s.%s, the first %s
is matched to aa and the second %s is matched to bb.cc. However, if the most lazy approach
fails, the algorithm searches back until the correct match is found or all the cases are traced.
For example, for the text aa.bb.11 and the rule %s.%d, the string %s is matched to aa.bb and
the number %d is matched to 11.
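The lazy matching with backtracking described above can be approximated with non-greedy regular expression groups. The following Python sketch is an illustration, not the Infobright implementation: the function names are hypothetical, %s is assumed to match any (possibly empty) run of characters, and %d is assumed to match one or more decimal digits.

```python
import re


def tokenize(rule):
    """Split a rule into '%s' / '%d' primitives and literal characters."""
    tokens, i = [], 0
    while i < len(rule):
        two = rule[i:i + 2]
        if two in ("%s", "%d"):
            if tokens and tokens[-1] == two:
                raise ValueError("ambiguous subsequence " + two + two)
            tokens.append(two)
            i += 2
        elif two == "%%":
            tokens.append(("literal", "%"))
            i += 2
        else:
            # Any other character, including a lone '%', is literal.
            tokens.append(("literal", rule[i]))
            i += 1
    return tokens


def decompose(rule, value):
    """Return the parts matched by %s/%d, or None if value does not match."""
    pattern = ""
    for token in tokenize(rule):
        if token == "%s":
            pattern += "(.*?)"       # lazy: take as little as possible
        elif token == "%d":
            pattern += "([0-9]+?)"   # lazy run of digits
        else:
            pattern += re.escape(token[1])
    match = re.fullmatch(pattern, value, re.DOTALL)
    return list(match.groups()) if match else None


print(decompose("%s.%s", "aa.bb.cc"))    # ['aa', 'bb.cc']
print(decompose("%s.%d", "aa.bb.11"))    # ['aa.bb', '11']
print(decompose("%s@%s", "no-at-sign"))  # None
```

The second call shows the backtracking case: the laziest choice for %s (aa) leaves bb.11, which cannot match %d, so the match for %s grows until the remainder is numeric.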
The current language is a simple, limited language that will be replaced with a much more
powerful language in the future. The current language does not support the following
regular expression constructs (these will be added in future releases):
• Sub-expressions
• Word boundaries
• Back-references, i.e. each group has a reference—$1 for the match of the first group, $2 for
the match of the second group and so on
Building recursive rules using the following operations is also not yet available:
• Concatenation: r1r2 where r1 and r2 are any pair of already defined rules—matches any
value that is concatenation of any pair of values, with v1 matching r1 and v2 matching r2
• Union (alternative): r1|r2 matches each value that matches one of r1 and r2
• Closure: r* matches each value which is any repetition of any value matching r
If you have data with IP addresses, this allows you to compare the performance of the
predefined IPv4 with IP decompositions expressible in the language—for example, with the
rule %d.%d.%d.%d.
The decomposition rules can be applied only to columns of string types that are not lookup
columns:
• CHAR
• VARCHAR
• BINARY
• VARBINARY
• TEXT
• TINYTEXT
You cannot set multiple rules on the same column. If the SET_DECOMPOSITION_RULE procedure
is called for a column with an already assigned rule, the previous rule is replaced with the
new rule.
To see the current decomposition rules for a particular table, use the SHOW_DECOMPOSITION
procedure. For example:
CALL SYS_INFOBRIGHT.SHOW_DECOMPOSITION('NETWORK', 'CONNECTION');
If a rule is assigned to a column, you cannot change or delete the rule from the
DECOMPOSITION_DICTIONARY system table.
DML commands:
• LOAD DATA
• INSERT
• UPDATE
If a rule is assigned to a column, instead of storing whole values, each value inserted into
the column is decomposed into the parts matching the subsequent occurrences of %s and
%d in the rule and the parts are compressed and stored in separate subcollections. Each
subcollection corresponds to one occurrence of %s or %d in the rule.
A value inserted into a column with a decomposition defined does not have to match the
rule. Such non-matching values are inserted into a separate subcollection. This subcollection
of outliers is compressed and stored independently of other subcollections.
You can obtain the accuracy of decomposition rules by setting the ControlMessages
parameter in the brighthouse.ini file to 2 (or higher):
ControlMessages = 2
If the parameter is set, on each COMMIT Infobright reports, for each column, the number of
outliers among the committed values (from INSERTs and LOADs). For example:
2011-05-25 16:59:03 Decomposition of ./network/connection.ip left 15 outliers.
Infobright also reports the change in the number of outliers for the updated values (from
UPDATEs), for example:
2011-05-25 16:59:46 The number of outliers increased by 2 after update(-s) on
network.connection.ip
2011-05-25 17:00:03 The number of outliers reduced by 3 after update(-s) on
network.connection.ip
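To track outlier counts over time, the first kind of log line shown above can be parsed with a regular expression. The following Python sketch is based only on the single sample line in this section, and it assumes that the path ./network/connection.ip means database network, table connection and column ip; the function name is hypothetical.

```python
import re

OUTLIER_LINE = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2}) (?P<time>\d{2}:\d{2}:\d{2}) "
    r"Decomposition of \./(?P<db>[^/]+)/(?P<table>[^.]+)\.(?P<column>\S+) "
    r"left (?P<outliers>\d+) outliers\.$"
)


def parse_outlier_line(line):
    """Return (db, table, column, outliers), or None for other log lines."""
    match = OUTLIER_LINE.match(line.strip())
    if match is None:
        return None
    return (match["db"], match["table"], match["column"],
            int(match["outliers"]))


line = ("2011-05-25 16:59:03 Decomposition of "
        "./network/connection.ip left 15 outliers.")
print(parse_outlier_line(line))   # ('network', 'connection', 'ip', 15)
```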
Note: Applying a decomposition rule DOES NOT always improve the compression ratio or
load and query times. A decomposition rule may result in a worse compression ratio,
slower loads and slower queries. To check that decomposition improves performance,
compare load time, compression ratio and query time when loading the same data to
a table with a decomposition rule defined and to a table without decomposition.
The change applies only to new data. The old data remains decomposed with the previously
used rules. If the rule for a column is deleted, new values are stored without decomposition.
If a value is updated to a new value with an UPDATE command, then for the new value
Infobright uses the original rule used to decompose the old value. The currently assigned
rules are not used for UPDATEs.
Disable SELinux
SELinux is intended to protect Linux servers on the public internet, such as web servers. It
provides an extra layer of security that isn’t really required for a back-end database server.
• In /etc/sysconfig/selinux add:
SELINUX=disabled
Swappiness
Set low swappiness to avoid unnecessary paging. This only helps for machines with low
levels of memory (say 4GB with 3GB allocated for Infobright).
• In /etc/rc.local add:
echo "7" > /proc/sys/vm/swappiness
Larger Readahead
• In /etc/rc.local add:
blockdev --setra 2048 /dev/sd<x>
Replace sd<x> with a proper device symbol (e.g. sdc); it should be the drive(s) on which
datadir and/or CacheFolder resides.
• In /etc/fstab add:
/dev/sdc1 /bha xfs noatime 1 2
Note: This is for data folders only. Linux boot partition can be ext3.
noatime
Use noatime options for mounting database and cache volumes (see below for details).
Otherwise the system will update the access time for files and directories (which degrades
performance).
Deadline Elevator
The default scheduler, CFQ, is 1% faster than the deadline elevator for a single user.
However, in a multi-user test with 4 users, the deadline elevator had 20% better performance.
• In /etc/rc.local add:
echo "deadline" > /sys/block/sd<x>/queue/scheduler
Replace sd<x> with a proper device symbol (e.g. sdc); it should be the drive(s) on which
datadir and/or CacheFolder resides.
Increase ulimit to unlimited or 32,768, since the default file limit of 1024 is insufficient
for large databases (lots of columns) or servers with multiple Infobright databases.
• To set it to a new value for this running session, which takes effect immediately, run
command:
# ulimit -n 8800
# ulimit -n -1    # for unlimited; recommended if the server isn't shared
• Exit all shell sessions for the user you want to change limits on.
• As root, edit the file /etc/security/limits.conf and add these two lines toward the end:
user1 soft nofile 16000
user1 hard nofile 20000
The two lines above change the max number of file handles - nofile - to new settings.
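To confirm which limits a process actually received, the values can be read back at run time. The following Python sketch uses the standard resource module (available on Unix-like systems only); the function name is hypothetical, and the 32,768 threshold is simply the figure recommended above.

```python
import resource


def open_file_limits():
    """Return the (soft, hard) limit on open files for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


soft, hard = open_file_limits()
print("soft:", soft, "hard:", hard)
if soft != resource.RLIM_INFINITY and soft < 32768:
    print("soft limit is below 32,768")
```

Run it as the user that starts the Infobright server, in a new session, so that changes to limits.conf are reflected.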
To fix this, increase ulimit (see the previous section, “Increase ulimit to Support Large Data
Volume or Users”).