IBM Netezza SQL translation guide
IBM Netezza data warehousing is designed to work with Netezza-specific SQL syntax. Netezza SQL is based on Postgres 7.2. SQL scripts written for Netezza can't be used in a BigQuery data warehouse without alterations, because the SQL dialects vary.
This document details the similarities and differences in SQL syntax between Netezza and BigQuery in the following areas:
- Data types
- SQL language elements
- Query syntax
- Data manipulation language (DML)
- Data definition language (DDL)
- Stored procedures
- Functions
You can also use batch SQL translation to migrate your SQL scripts in bulk, or interactive SQL translation to translate ad-hoc queries. IBM Netezza SQL/NZPLSQL is supported by both tools in preview.
Data types
Netezza | BigQuery | Notes |
---|---|---|
INTEGER/INT/INT4 | INT64 | |
SMALLINT/INT2 | INT64 | |
BYTEINT/INT1 | INT64 | |
BIGINT/INT8 | INT64 | |
DECIMAL | NUMERIC | The DECIMAL data type in Netezza is an alias for the NUMERIC data type. |
NUMERIC | NUMERIC INT64 | |
NUMERIC(p,s) | NUMERIC | The NUMERIC type in BigQuery does not enforce custom digit or scale bounds (constraints) like Netezza does. BigQuery has fixed 9 digits after the decimal, while Netezza allows a custom setup. In Netezza, precision p can range from 1 to 38, and scale s from 0 to the precision. |
FLOAT(p) | FLOAT64 | |
REAL/FLOAT(6) | FLOAT64 | |
DOUBLE PRECISION/FLOAT(14) | FLOAT64 | |
CHAR/CHARACTER | STRING | The STRING type in BigQuery is variable-length and does not require manually setting a max character length as the Netezza CHARACTER and VARCHAR types require. The default value of n in CHAR(n) is 1. The maximum character string size is 64,000. |
VARCHAR | STRING | The STRING type in BigQuery is variable-length and does not require manually setting a max character length as the Netezza CHARACTER and VARCHAR types require. The maximum character string size is 64,000. |
NCHAR | STRING | The STRING type in BigQuery is stored as variable length UTF-8 encoded Unicode. The maximum length is 16,000 characters. |
NVARCHAR | STRING | The STRING type in BigQuery is stored as variable-length UTF-8-encoded Unicode. The maximum length is 16,000 characters. |
VARBINARY | BYTES | |
ST_GEOMETRY | GEOGRAPHY | |
BOOLEAN/BOOL | BOOL | The BOOL type in BigQuery can only accept TRUE/FALSE , unlike the BOOL type in Netezza, which can accept a variety of values like 0/1 , yes/no , true/false, on/off . |
DATE | DATE | |
TIME | TIME | |
TIMETZ/TIME WITH TIME ZONE | TIME | Netezza stores the TIME data type in UTC and allows you to pass an offset from UTC using the WITH TIME ZONE syntax. The TIME data type in BigQuery represents a time that's independent of any date or time zone. |
TIMESTAMP | DATETIME | The Netezza TIMESTAMP type does not include a time zone, the same as the BigQuery DATETIME type. |
ARRAY | There is no array data type in Netezza. The array type is instead stored in a varchar field. |
Timestamp and date type formatting
For more information about the date type formatting that Netezza SQL uses, see the Netezza Date Time template patterns documentation. For more information about the date time functions, see the Netezza date/time functions documentation.
When you convert date type formatting elements from Netezza to GoogleSQL, you must pay particular attention to time zone differences between TIMESTAMP
and DATETIME
, as summarized in the following table:
Netezza | BigQuery |
---|---|
CURRENT_TIMESTAMP CURRENT_TIME TIME information in Netezza can have different time zone information, which is defined using the WITH TIME ZONE syntax. | If possible, use the CURRENT_TIMESTAMP function, which is formatted correctly. However, the output format does not always show the UTC time zone (internally, BigQuery does not have a time zone). The DATETIME object in the bq command-line tool and Google Cloud console is formatted using a T separator according to RFC 3339. However, in Python and Java JDBC, a space is used as a separator. Use the explicit FORMAT_DATETIME function to define the date format correctly. Otherwise, an explicit cast is made to a string, for example:CAST(CURRENT_DATETIME() AS STRING) This also returns a space separator. |
CURRENT_DATE | CURRENT_DATE |
CURRENT_DATE-3 | BigQuery does not support arithmetic data operations. Instead, use the DATE_ADD function. |
SELECT
statement
Generally, the Netezza SELECT
statement is compatible with BigQuery. The following table contains a list of exceptions:
Netezza | BigQuery |
---|---|
A SELECT statement without FROM clause | Supports special case such as the following:
|
SELECT (subquery) AS flag, CASE WHEN flag = 1 THEN ... | In BigQuery, columns cannot reference the output of other columns defined within the same query. You must duplicate the logic or move the logic into a nested query. Option 1 SELECT (subquery) AS flag, CASE WHEN (subquery) = 1 THEN ... Option 2 SELECT q.*, CASE WHEN flag = 1 THEN ... FROM ( SELECT (subquery) AS flag, ... ) AS q |
Comparison operators
Netezza | BigQuery | Description |
---|---|---|
exp = exp2 | exp = exp2 | Equal |
exp <= exp2 | exp <= exp2 | Less than or equal to |
exp < exp2 | exp < exp2 | Less than |
exp <> exp2 exp != exp2 | exp <> exp2 exp != exp2 | Not equal |
exp >= exp2 | exp >= exp2 | Greater than or equal to |
exp > exp2 | exp > exp2 | Greater than |
Built-in SQL functions
Netezza | BigQuery | Description |
---|---|---|
CURRENT_DATE | CURRENT_DATE | Get the current date (year, month, and day). |
CURRENT_TIME | CURRENT_TIME | Get the current time with fraction. |
CURRENT_TIMESTAMP | CURRENT_TIMESTAMP | Get the current system date and time, to the nearest full second. |
NOW | CURRENT_TIMESTAMP | Get the current system date and time, to the nearest full second. |
COALESCE(exp, 0) | COALESCE(exp, 0) | Replace NULL with zero. |
NVL(exp, 0) | IFNULL(exp, 0) | Replace NULL with zero. |
EXTRACT(DOY FROM timestamp_expression) | EXTRACT(DAYOFYEAR FROM timestamp_expression) | Return the number of days from the beginning of the year. |
ADD_MONTHS(date_expr, num_expr) | DATE_ADD(date, INTERVAL k MONTH) | Add months to a date. |
DURATION_ADD(date, k) | DATE_ADD(date, INTERVAL k DAY) | Perform addition on dates. |
DURATION_SUBTRACT(date, k) | DATE_SUB(date, INTERVAL k DAY) | Perform subtraction on dates. |
str1 || str2 | CONCAT(str1, str2) | Concatenate strings. |
Functions
This section compares Netezza and BigQuery functions.
Aggregate functions
Analytical functions
Date and time functions
String functions
Math functions
DML syntax
This section compares Netezza and BigQuery DML syntax.
INSERT
statement
Netezza | BigQuery |
---|---|
INSERT INTO table VALUES (...); | INSERT INTO table (...) VALUES (...); Netezza offers a DEFAULT keyword and other constraints for columns. In BigQuery, omitting column names in the INSERT statement is valid only if all columns are given. |
INSERT INTO table (...) VALUES (...); INSERT INTO table (...) VALUES (...); | INSERT INTO table VALUES (), (); BigQuery imposes DML quotas, which restrict the number of DML statements you can execute daily. To make the best use of your quota, consider the following approaches:
|
DML scripts in BigQuery have slightly different consistency semantics than the equivalent statements in Netezza. Also note that BigQuery does not offer constraints apart from NOT NULL
.
For an overview of snapshot isolation and session and transaction handling, see Consistency guarantees and transaction isolation.
UPDATE
statement
In Netezza, the WHERE
clause is optional, but in BigQuery it is necessary.
Netezza | BigQuery |
---|---|
UPDATE tbl SET tbl.col1=val1; | Not supported without the WHERE clause. Use a WHERE true clause to update all rows. |
UPDATE A SET y = B.y, z = B.z + 1 FROM B WHERE A.x = B.x AND A.y IS NULL; | UPDATE A SET y = B.y, z = B.z + 1 FROM B WHERE A.x = B.x AND A.y IS NULL; |
UPDATE A alias SET x = x + 1 WHERE f(x) IN (0, 1) | UPDATE A SET x = x + 1 WHERE f(x) IN (0, 1); |
UPDATE A SET z = B.z FROM B WHERE A.x = B.x AND A.y = B.y | UPDATE A SET z = B.z FROM B WHERE A.x = B.x AND A.y = B.y; |
For examples, see UPDATE
examples.
Because of DML quotas, we recommend that you use larger MERGE
statements instead of multiple single UPDATE
and INSERT
statements. DML scripts in BigQuery have slightly different consistency semantics than equivalent statements in Netezza. For an overview of snapshot isolation and session and transaction handling, see Consistency guarantees and transaction isolation.
DELETE
and TRUNCATE
statements
The DELETE
and TRUNCATE
statements are both ways to remove rows from a table without affecting the table schema or indexes. The TRUNCATE
statement has the same effect as the DELETE
statement, but is much faster than the DELETE
statement for large tables. The TRUNCATE
statement is supported in Netezza but not supported in BigQuery. However, you can use DELETE
statements in both Netezza and BigQuery.
In BigQuery, the DELETE
statement must have a WHERE
clause. In Netezza, the WHERE
clause is optional. If the WHERE
clause is not specified, all the rows in the Netezza table are deleted.
Netezza | BigQuery | Description |
---|---|---|
BEGIN; LOCK TABLE A IN EXCLUSIVE MODE; DELETE FROM A; INSERT INTO A SELECT * FROM B; COMMIT; | Replacing the contents of a table with query output is the equivalent of a transaction. You can do this with either a query or a copy (cp ) operation.bq query \ bq cp \ | Replace the contents of a table with the results of a query. |
DELETE FROM database.table | DELETE FROM table WHERE TRUE; | In Netezza, when a delete statement is run, the rows are not deleted physically but only marked for deletion. Running the GROOM TABLE or nzreclaim commands later removes the rows marked for deletion and reclaims the corresponding disk space. |
GROOM TABLE | Netezza uses the GROOM TABLE command to reclaim disk space by removing rows marked for deletion. |
MERGE
statement
A MERGE
statement must match at most one source row for each target row. DML scripts in BigQuery have slightly different consistency semantics than the equivalent statements in Netezza. For an overview of snapshot isolation and session and transaction handling, see Consistency guarantees and transaction isolation. For examples, see BigQuery MERGE
examples and Netezza MERGE
examples.
DDL syntax
This section compares Netezza and BigQuery DDL syntax.
CREATE TABLE
statement
Netezza | BigQuery | Description |
---|---|---|
TEMP TEMPORARY | With BigQuery's DDL support, you can create a table from the results of a query and specify its expiration at creation time. For example, for three days:CREATE TABLE 'fh-bigquery.public_dump.vtemp' OPTIONS (expiration_timestamp=TIMESTAMP_ADD(CURRENT_TIMESTAMP(), INTERVAL 3 DAY)) | Create tables temporary to a session. |
ZONE MAPS | Not supported. | Quick search for WHERE condition. |
DISTRIBUTE ON | PARTITION BY | Partitioning. This is not a direct translation. DISTRIBUTE ON shares data between nodes, usually with a unique key for even distribution, while PARTITION BY prunes data into segments. |
ORGANIZE ON | CLUSTER BY | Both Netezza and BigQuery support up to four keys for clustering. Netezza clustered base tables (CBT) provide equal precedence to each of the clustering columns. BigQuery gives precedence to the first column on which the table is clustered, followed by the second column, and so on. |
ROW SECURITY | Authorized View | Row-level security. |
CONSTRAINT | Not supported | Check constraints. |
DROP
statement
Netezza | BigQuery | Description |
---|---|---|
DROP TABLE | DROP TABLE | |
DROP DATABASE | DROP DATABASE | |
DROP VIEW | DROP VIEW |
Column options and attributes
Netezza | BigQuery | Description |
---|---|---|
NULL NOT NULL | NULLABLE REQUIRED | Specify if the column is allowed to contain NULL values. |
REFERENCES | Not supported | Specify column constraint. |
UNIQUE | Not supported | Each value in the column must be unique. |
DEFAULT | Not supported | Default value for all values in the column. |
Temporary tables
Netezza supports TEMPORARY
tables that exist during the duration of a session.
To build a temporary table in BigQuery, do the following:
- Create a dataset that has a short time to live (for example, 12 hours).
Create the temporary table in the dataset, with a table name prefix of
temp
. For example, to create a table that expires in one hour, do this:CREATE TABLE temp.name (col1, col2, ...) OPTIONS(expiration_timestamp = TIMESTAMP_ADD(CURRENT_TIMESTAMP(), INTERVAL 1 HOUR));
Start reading and writing from the temporary table.
You can also remove duplicates independently in order to find errors in downstream systems.
Note that BigQuery does not support DEFAULT
and IDENTITY
(sequences) columns.
Procedural SQL statements
Netezza uses the NZPLSQL scripting language to work with stored procedures. NZPLSQL is based on Postgres' PL/pgSQL language. This section describes how to convert procedural SQL statements used in stored procedures, functions, and triggers from Netezza to BigQuery.
CREATE PROCEDURE
statement
Netezza and BigQuery both support creating stored procedures by using the CREATE PROCEDURE
statement. For more information, see Work with SQL stored procedures.
Variable declaration and assignment
Netezza | BigQuery | Description |
---|---|---|
DECLARE var datatype(len) [DEFAULT value]; | DECLARE | Declare variable. |
SET var = value; | SET | Assign value to variable. |
Exception handlers
Netezza supports exception handlers that can be triggered for certain error conditions. BigQuery does not support condition handlers.
Netezza | BigQuery | Description |
---|---|---|
EXCEPTION | Not supported | Declare SQL exception handler for general errors. |
Dynamic SQL statements
Netezza supports dynamic SQL queries inside stored procedures. BigQuery does not support dynamic SQL statements.
Netezza | BigQuery | Description |
---|---|---|
EXECUTE IMMEDIATE sql_str; | EXECUTE IMMEDIATE sql_str; | Execute dynamic SQL. |
Flow-of-control statements
Netezza | BigQuery | Description |
---|---|---|
IF THEN ELSE STATEMENT IF conditionTHEN ... ELSE ... END IF; | IF conditionTHEN ... ELSE ... END IF; | Execute conditionally. |
Iterative ControlFOR var AS SELECT ... DO stmts END FOR; FOR var AS cur CURSOR FOR SELECT ... DO stmts END FOR; | Not supported | Iterate over a collection of rows. |
Iterative ControlLOOP stmts END LOOP; | LOOP sql_statement_list END LOOP; | Loop block of statements. |
EXIT WHEN | BREAK | Exit a procedure. |
WHILE *condition* LOOP | WHILE conditionDO ... END WHILE | Execute a loop of statements until a while condition fails. |
Other statements and procedural language elements
Netezza | BigQuery | Description |
---|---|---|
CALL proc(param,...) | Not supported | Execute a procedure. |
EXEC proc(param,...) | Not supported | Execute a procedure. |
EXECUTE proc(param,...) | Not supported | Execute a procedure. |
Multi-statement and multi-line SQL statements
Both Netezza and BigQuery support transactions (sessions) and therefore support statements separated by semicolons that are consistently executed together. For more information, see Multi-statement transactions.
Other SQL statements
Netezza | BigQuery | Description |
---|---|---|
GENERATE STATISTICS | Generate statistics for all the tables in the current database. | |
GENERATE STATISTICS ON table_name | Generate statistics for a specific table. | |
GENERATE STATISTICS ON table_name(col1,col4) | Either use statistical functions like MIN, MAX, AVG, etc., use the UI, or use the Cloud Data Loss Prevention API. | Generate statistics for specific columns in a table. |
GENERATE STATISTICS ON table_name | APPROX_COUNT_DISTINCT(col) | Show the number of unique values for columns. |
INSERT INTO table_name | INSERT INTO table_name | Insert a row. |
LOCK TABLE table_name FOR EXCLUSIVE; | Not supported | Lock row. |
SET SESSION CHARACTERISTICS AS TRANSACTION ISOLATION LEVEL ... | BigQuery always uses Snapshot Isolation. For details, see Consistency guarantees and transaction isolation. | Define the transaction isolation level. |
BEGIN TRANSACTION END TRANSACTION COMMIT | BigQuery always uses Snapshot Isolation. For details, see Consistency guarantees and transaction isolation. | Define the transaction boundary for multi-statement requests. |
EXPLAIN ... | Not supported. Similar features in the query plan and timeline | Show query plan for a SELECT statement. |
User Views metadata System Views metadata | SELECT * EXCEPT(is_typed) FROM mydataset.INFORMATION_SCHEMA.TABLES; BigQuery Information Schema | Query objects in the database |
Consistency guarantees and transaction isolation
Both Netezza and BigQuery are atomic, that is, ACID compliant on a per-mutation level across many rows. For example, a MERGE
operation is completely atomic, even with multiple inserted values.
Transactions
Netezza syntactically accepts all four modes of ANSI SQL transaction isolation. However, regardless of what mode is specified, only the SERIALIZABLE
mode is used, which provides the highest possible level of consistency. This mode also avoids dirty, nonrepeatable, and phantom reads between concurrent transactions. Netezza does not use conventional locking to enforce consistency. Instead, it uses serialization dependency checking, a form of optimistic concurrency control to automatically roll back the latest transaction when two transactions attempt to modify the same data.
BigQuery also supports transactions. BigQuery helps ensure optimistic concurrency control (first to commit has priority) with snapshot isolation, in which a query reads the last committed data before the query starts. This approach guarantees the same level of consistency on a per-row, per-mutation basis and across rows within the same DML statement, yet avoids deadlocks. In the case of multiple DML updates against the same table, BigQuery switches to pessimistic concurrency control. Load jobs can run completely independently and append to tables.
Rollback
Netezza supports the ROLLBACK
statement to abort the current transaction and roll back all the changes made in the transaction.
In BigQuery, you can use the ROLLBACK TRANSACTION
statement.
Database limits
Limit | Netezza | BigQuery |
---|---|---|
Tables per database | 32,000 | Unrestricted |
Columns per table | 1600 | 10000 |
Maximum row size | 64 KB | 100 MB |
Column and table name length | 128 bytes | 16,384 Unicode characters |
Rows per table | Unlimited | Unlimited |
Maximum SQL request length | 1 MB (maximum unresolved standard SQL query length). 12 MB (maximum resolved legacy and standard SQL query length). : 10 MB (HTTP request size limit) 10,000 (maximum rows per request) | |
Maximum request and response size | 10 MB (request) and 10 GB (response) or virtually unlimited if using pagination or the Cloud Storage API. | |
Maximum number of concurrent sessions | 63 concurrent read-write transactions. 2000 concurrent connections to the server. | 100 concurrent queries (can be raised with slot reservation), 300 concurrent API requests per user. |
What's next
- Get step-by-step instructions to Migrate from IBM Netezza to BigQuery.