Apache ShardingSphere 5.1.0 Now Available
Apache ShardingSphere 5.1.0 is officially released and available. The previous 5.0.0 GA version was launched in November last year, and marked ShardingSphere’s evolution from middleware to an ecosystem.
This meant gaining the power to transform any database into a distributed database system and enhance it with features such as data sharding, distributed transactions, data encryption, SQL audit, database gateway, and more.
Over the past three months, the ShardingSphere community has received a lot of feedback from developers, partners, and users across different industries. We’d like to extend our gratitude for that feedback, because without it this update would not have been possible.
Our community author and Apache ShardingSphere PMC member, Meng Haoran, shares in detail what’s new in Apache ShardingSphere version 5.1.0.
Based on user feedback from the 5.0.0 GA version, we also decided to commit our efforts to improve ShardingSphere’s ecosystem, kernel and feature modules:
Kernel
Building a powerful and stable kernel has always been the purpose of ShardingSphere.
In the new version, we fixed a large number of issues to better support parsing of PostgreSQL and openGauss SQL, and we now support function parsing and binlog statement parsing.
We also optimized the rewrite engine and improved the efficiency of loading large numbers of single tables to further improve overall kernel performance. Moreover, ShardingSphere now adds the SQL hint function, which lets users apply forced routing more conveniently.
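To make the forced-routing hint more concrete, here is a minimal Python sketch that prefixes a statement with a hint comment before sending it to ShardingSphere. The comment format shown (a block comment starting with "ShardingSphere hint:" and carrying a dataSourceName attribute) is an assumption based on the project documentation for this feature, which also requires comment parsing to be enabled in the SQL parser rule; check both against your version. The helper name with_datasource_hint is hypothetical.

```python
import re

# Assumed hint prefix; verify against the ShardingSphere documentation for your version.
HINT_TEMPLATE = "/* ShardingSphere hint: dataSourceName={name} */"

# Data source names are restricted to simple identifiers here so the value
# cannot close the comment early or smuggle in extra SQL.
_VALID_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def with_datasource_hint(sql: str, data_source: str) -> str:
    """Return sql prefixed with a forced-routing hint comment (hypothetical helper)."""
    if not _VALID_NAME.fullmatch(data_source):
        raise ValueError(f"invalid data source name: {data_source!r}")
    return f"{HINT_TEMPLATE.format(name=data_source)} {sql.strip()}"


if __name__ == "__main__":
    print(with_datasource_hint("SELECT * FROM t_order", "ds_0"))
```

The resulting string can be sent through any MySQL or PostgreSQL client connected to ShardingSphere-Proxy, or through ShardingSphere-JDBC.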
Access Terminal
For ShardingSphere-Proxy, we fixed issues in parsing the MySQL and PostgreSQL protocols. We also added the SCRAM SHA-256 authentication mode to support openGauss and optimized the openGauss batch insert protocol to improve data insert performance.
For ShardingSphere-JDBC, we removed the check for NULL values in rules, so users can still use JDBC even if there is no value in rules. We also optimized metadata loading for the logical database so that only the specified schemaName is loaded, which accelerates boot-up.
Elastic Scale-Out
We made many adjustments to elastic scale-out in this version.
First, the original scaling module has been moved to the data-pipeline module under the kernel. In the future, this module will provide more data processing capabilities in addition to data migration.
Second, the scaling configuration has been moved from server.yaml to the config-sharding.yaml configuration file. Together with data sharding, elastic scale-out will provide users with better data sharding services.
DistSQL
Many practical DistSQL statements are now implemented, giving users more tools to manage the ShardingSphere distributed database ecosystem.
Some distributed cluster governance capabilities are optimized as well. For example, when users enable or stop instances through instanceId while there is only one secondary database, they are informed that the instance cannot be stopped, which significantly improves user experience.
Read/Write Splitting and High Availability
The APIs of read/write splitting and high availability have both been optimized. Read/write splitting now supports both static and dynamic configurations, and the dynamic configuration needs to be used together with high availability.
The high availability configuration and algorithm are now separated, making the configuration more unified and concise. Additionally, Spring Boot and Spring Namespace now support high availability configuration, and the high availability feature is now implemented for openGauss.
Shadow Database
The shadow database feature has been partly optimized. It now supports logical data source transfer, validates data types that are not supported by column matching shadow algorithms, renames the note shadow algorithm to the HINT shadow algorithm, removes the enable attribute from the configuration, and optimizes the shadow algorithm decision logic to improve performance.
This post only covers a part of the updates we made. While developing version 5.1.0, we merged 1000+ PRs from the community. Compared with version 5.0.0 GA, version 5.1.0 has been significantly improved in terms of its kernel capabilities, core functions, and performance to deliver a better user experience.
Here are the details of the release of version 5.1.0:
New Features
- Support SQL hint
- New DistSQL syntax: SHOW AUTHORITY RULE
- New DistSQL syntax: SHOW TRANSACTION RULE
- New DistSQL syntax: ALTER TRANSACTION RULE
- New DistSQL syntax: SHOW SQL_PARSER RULE
- New DistSQL syntax: ALTER SQL_PARSER RULE
- New DistSQL syntax: ALTER DEFAULT SHARDING STRATEGY
- New DistSQL syntax: DROP DEFAULT SHARDING STRATEGY
- New DistSQL syntax: CREATE DEFAULT SINGLE TABLE RULE
- New DistSQL syntax: SHOW SINGLE TABLES
- New DistSQL syntax: SHOW SINGLE TABLE RULES
- New DistSQL syntax: SHOW SHARDING TABLE NODES
- New DistSQL syntax: CREATE/ALTER/DROP SHARDING KEY GENERATOR
- New DistSQL syntax: SHOW SHARDING KEY GENERATORS
- New DistSQL syntax: REFRESH TABLE METADATA
- New DistSQL syntax: PARSE SQL, which outputs the abstract syntax tree obtained by parsing SQL
- New DistSQL syntax: SHOW UNUSED SHARDING ALGORITHMS
- New DistSQL syntax: SHOW UNUSED SHARDING KEY GENERATORS
- New DistSQL syntax: CREATE/DROP SHARDING SCALING RULE
- New DistSQL syntax: ENABLE/DISABLE SHARDING SCALING RULE
- New DistSQL syntax: SHOW SHARDING SCALING RULES
- New DistSQL syntax: SHOW INSTANCE MODE
- New DistSQL syntax: COUNT SCHEMA RULES
- Scaling: Add rateLimiter configuration and QPS, TPS implementations
- Scaling: Add DATA_MATCH data consistency check
- Scaling: Add batchSize configuration to avoid possible OOME
- Scaling: Add streamChannel configuration and MEMORY implementation
- Scaling: Support MySQL BINARY data type
- Scaling: Support MySQL YEAR data type
- Scaling: Support PostgreSQL BIT data type
- Scaling: Support PostgreSQL MONEY data type
- Database discovery adds support for JDBC Spring Boot
- Database discovery adds support for JDBC Spring Namespace
- Database discovery adds support for openGauss
- Shadow DB adds support for logical data source transfer
- Add data type validator for column matching shadow algorithm
- Add support for xa start/end/prepare/commit/recover in encrypt case with only one data source
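The DATA_MATCH consistency check and the batchSize setting listed above can be pictured with a small sketch: rows from the source and the target are read in fixed-size batches and compared record by record, so memory use stays bounded no matter how large the table is. This is not ShardingSphere's implementation. The function names, the (primary key, column values) row shape, and the assumption that both sides are ordered by primary key are all illustrative.

```python
from itertools import islice
from typing import Iterable, Iterator, List, Tuple

Row = Tuple[int, tuple]  # (primary key, remaining column values); illustrative shape


def batches(rows: Iterable[Row], batch_size: int) -> Iterator[List[Row]]:
    """Yield lists of at most batch_size rows so only one batch is held in memory."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    it = iter(rows)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


def data_match(source: Iterable[Row], target: Iterable[Row], batch_size: int = 1000) -> bool:
    """Compare two primary-key-ordered row streams batch by batch.

    Returns True only if both streams contain the same rows in the same order.
    """
    source_batches = batches(source, batch_size)
    target_batches = batches(target, batch_size)
    while True:
        src = next(source_batches, None)
        tgt = next(target_batches, None)
        if src is None and tgt is None:
            return True   # both streams exhausted together
        if src != tgt:
            return False  # differing rows, or one side ended early
```

Because both sides are cut with the same batch size, batches line up one to one, and a smaller batch size trades throughput for a lower memory ceiling.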
API Changes
- Redesign the database discovery related DistSQL syntax
- In DistSQL, the keyword GENERATED_KEY is adjusted to KEY_GENERATE_STRATEGY
- Native authority provider is marked as deprecated and will be removed in a future version
- Scaling: Move scaling configuration from server.yaml to config-sharding.yaml
- Scaling: Rename clusterAutoSwitchAlgorithm SPI to completionDetector and refactor method parameter
- Scaling: Data consistency check API method rename and return type change
- Database discovery module API refactoring
- Read/write-splitting supports static and dynamic configuration
- Shadow DB remove the enable configuration
- Shadow algorithm type modified
Enhancements
- Improve performance of loading multiple single tables
- Remove automatically added order by primary key clause
- Optimize binding table route logic without sharding column in join condition
- Support updating the sharding key when the sharding routing result stays the same
- Optimize rewrite engine performance
- Support select union/union all … statements by federation engine
- Support insert on duplicate key update of the sharding column when the route context stays the same
- Use union all to merge SQL route units for simple select to improve performance
- Supports autocommit in ShardingSphere-Proxy
- ShardingSphere openGauss Proxy supports SHA-256 authentication method
- Remove property java.net.preferIPv4Stack=true from Proxy startup script
- Remove the verification of null rules for JDBC
- Optimize performance of executing openGauss batch bind
- Disable Netty resource leak detector by default
- Supports describe prepared statement in PostgreSQL / openGauss Proxy
- Optimize performance of executing PostgreSQL batched inserts
- Add instance_id to the result of SHOW INSTANCE LIST
- Support using instance_id to perform operations when enabling/disabling a proxy instance
- Support automatically creating the algorithm when CREATE SHARDING TABLE RULE, reducing the steps of creating a rule
- Support specifying an existing KeyGenerator when CREATE SHARDING TABLE RULE
- DROP DATABASE supports IF EXISTS option
- DATANODES in SHARDING TABLE RULE supports enumerated inline expressions
- CREATE/ALTER SHARDING TABLE RULE supports complex sharding algorithm
- SHOW SHARDING TABLE NODES supports non-inline scenarios (range, time, etc.)
- When there is only one read data source in the read/write-splitting rule, it is not allowed to be disabled
- Scaling: Add basic support of chunked streaming data consistency check
- Shadow algorithm decision logic optimization to improve performance
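The enumerated inline expressions mentioned in the list above sit alongside the range form: a range such as ${0..1} covers every integer between its bounds, while an enumeration such as ${[0, 2]} names each value explicitly. The sketch below is a simplified, hypothetical expander written only to show how either form turns into concrete data nodes. ShardingSphere itself evaluates inline expressions with Groovy and supports much more than this toy parser handles.

```python
import re
from itertools import product
from typing import List

# Matches ${...}; the body is either a range "a..b" or an enumeration "[x, y, z]".
_SEGMENT = re.compile(r"\$\{([^}]*)\}")


def _values(body: str) -> List[str]:
    body = body.strip()
    if body.startswith("[") and body.endswith("]"):
        # Enumerated form: ${[0, 2, 5]}
        return [item.strip() for item in body[1:-1].split(",") if item.strip()]
    match = re.fullmatch(r"(-?\d+)\.\.(-?\d+)", body)
    if match:
        # Range form: ${0..3}, inclusive on both ends
        start, end = int(match.group(1)), int(match.group(2))
        return [str(i) for i in range(start, end + 1)]
    raise ValueError(f"unsupported inline expression: {body!r}")


def expand(expression: str) -> List[str]:
    """Expand one inline expression into every concrete name it denotes."""
    parts = _SEGMENT.split(expression)
    # re.split with one capture group alternates literal text and captured bodies.
    literals, bodies = parts[0::2], parts[1::2]
    choices = [_values(body) for body in bodies]
    results = []
    for combo in product(*choices):
        pieces = [literals[0]]
        for value, literal in zip(combo, literals[1:]):
            pieces.append(value)
            pieces.append(literal)
        results.append("".join(pieces))
    return results


if __name__ == "__main__":
    for node in expand("ds_${[0, 2]}.t_order_${0..1}"):
        print(node)
```

The enumerated form is useful when the data sources or tables are not numbered contiguously, which a range cannot express.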
Refactoring
- Refactor federation engine scan table logic
- Avoid duplicated TCL SQL parsing when executing prepared statement in Proxy
- Scaling: Add pipeline modules to redesign scaling
- Scaling: Refactor several job configuration structure
- Scaling: Precalculate tasks splitting and persist in job configuration
- Scaling: Add basic support of pipeline-core code reuse for encryption job
- Scaling: Add basic support of scaling job and encryption job combined running
- Scaling: Add input and output configuration, including workerThread and rateLimiter
- Scaling: Move blockQueueSize into streamChannel
- Scaling: Change jobId type from integer to text
- Optimize JDBC to load only the specified schema
- Optimize meta data structure of the registry center
- Rename Note shadow algorithm to HINT shadow algorithm
Bug Fixes
- Support parsing function
- Fix alter table drop constraint
- Fix optimize table route
- Support Route resource group
- Support parsing binlog
- Support PostgreSQL/openGauss ‘&’ and ‘|’ operator
- Support parsing openGauss insert on duplicate key
- Support parsing PostgreSQL/openGauss union
- Support queries on tables whose columns contain keywords
- Fix missing parameter in function
- Fix sub query table with no alias
- Fix utc timestamp function
- Fix alter encrypt column
- Support alter column with position encrypt column
- Fix delete with schema for PostgreSQL
- Fix wrong route result caused by Oracle parser ambiguity
- Fix projection count error when using sharding and encrypt
- Fix NPE when using shadow and readwrite_splitting
- Fix wrong metadata when actual table is case insensitive
- Fix encrypt rewrite exception when executing multiple table join query
- Fix encrypt rewrite wrong result with table level queryWithCipherColumn
- Fix parsing Chinese
- Fix encrypt exists sub query
- Fix full route caused by the MySQL BINARY keyword in the sharding condition
- Fix getResultSet method empty result exception when using JDBCMemoryQueryResult processing statement
- Fix incorrect shard table validation logic when creating stored function/procedure
- Fix null charset exception that occurs when connecting to Proxy with some PostgreSQL clients
- Fix incorrect transaction status caused by executing commit in a prepared statement in MySQL Proxy
- Fix client connected to Proxy possibly getting stuck if an error occurs in PostgreSQL with a non-English locale
- Fix file not found when path of configurations contains blank character
- Fix possibly incorrect transaction status caused by early flush
- Fix the unsigned datatype problem when querying with PrepareStatement
- Fix protocol violation in implementations of prepared statement in MySQL Proxy
- Fix caching too many connections in openGauss batch bind
- Fix the problem of missing data in SHOW READWRITE_SPLITTING RULES when db-discovery and readwrite-splitting are used together
- Fix the problem of missing data in SHOW READWRITE_SPLITTING READ RESOURCES when db-discovery and readwrite-splitting are used together
- Fix the NPE when the CREATE SHARDING TABLE RULE statement does not specify the sub-database and sub-table strategy
- Fix NPE when PREVIEW SQL by schema.table
- Fix DISABLE statement could disable readwrite-splitting write data source in some cases
- Fix DISABLE INSTANCE could disable the current instance in some cases
- Fix the issue that user may query the unauthorized logic schema when the provider is SCHEMA_PRIVILEGES_PERMITTED
- Fix NPE when authority provider is not configured
- Scaling: Fix DB connection leak on XA initialization triggered by data consistency check
- Scaling: Fix PostgreSQL replication stream exception on multiple data sources
- Scaling: Fix migrating updated record exception on PostgreSQL incremental phase
- Scaling: Fix MySQL 5.5 check BINLOG_ROW_IMAGE option failure
- Scaling: Fix PostgreSQL xml data type consistency check
- Fix database discovery failed to modify cron configuration
- Fix single read data source use weight loadbalance algorithm error
- Fix create redundant data source without memory mode
- Fix column value matching shadow algorithm data type conversion exception
Apache ShardingSphere Open Source Project Links:
Author
Haoran Meng
SphereEx Senior Development Engineer
Apache ShardingSphere PMC
Previously responsible for database product R&D at JingDong Technology, he is passionate about open source and database ecosystems. Currently, he focuses on developing the ShardingSphere database ecosystem and building its open source community.