继我们之前关于 SQL Server 数据压缩的问题之后,我们已经取得了良好的进展。对于其中一张表,我们将其从20 GB压缩到2 GB,因此这证明最佳压缩比可以达到10倍。
另一个表的大小(以表的保留空间衡量)从 72.55 GB 减少到 67.06 GB。因此,与上述最佳情况相比,其压缩优势还不够。
与上一个问题相同,我们正在为 Magento-v2 应用程序开发一个数据仓库,表名为msab_magento.sales_order_item
.
- 该表的各个列中还包含大量
null
值。 - 但是,该表包含
product_option
类型为 的列,nvarchar(max)
与 MySQL 源中类型为 的同一列相对应text
。而该列存储的是JSON格式的文本数据。每行,该列上的平均字符串长度为4923
,或9 KB
每个字符在 中占用两个字节nvarchar
。
关于JSON列,我们的初步想法是:
- 由于 SQL Server 页面级压缩作用于
8 KB
每个大小的各个页面。因此,页面大小小于列大小。因此,这可能会影响压缩。 - JSON 文档来自产品配置中的模板,其中填充了每笔交易的特定信息。如果连续交易不是针对同一产品,则它们的 JSON 字段将从不同的模板生成,并具有不同的密钥。这使得字典压缩算法的应用变得困难。
- 然而,在全局数据集中,每个产品都会有大量订单,因此 JSON 键仍然会重复很多,只是可能跨越比一页 ( ) 大小更长的范围
8 KB
。
我们的问题:
我们希望使数据压缩更好并且需要指导。还请提醒我们瓶颈是否在 JSON 列以外的其他地方。
我们想,这个
sales_order_item
表对应的是GitHub 上存储库中的这个模型,对吗?magento2
如果可能的话,我们还需要有关如何导航 Magento 源代码的指导。
另请参阅下面详细信息中我们数据仓库中表的 DDL,如有任何问题请告诉我。
我们非常感谢任何提示和建议。
细节:
/****** Object: Table [msab_magento].[sales_order_item] Script Date: 11/8/2023 5:08:01 PM ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [msab_magento].[sales_order_item](
[item_id] [bigint] IDENTITY(3233417,1) NOT NULL,
[order_id] [bigint] NOT NULL,
[parent_item_id] [bigint] NULL,
[quote_item_id] [bigint] NULL,
[store_id] [int] NULL,
[created_at] [datetime] NOT NULL,
[updated_at] [datetime] NOT NULL,
[product_id] [bigint] NULL,
[product_type] [nvarchar](255) NULL,
[product_options] [nvarchar](max) NULL,
[weight] [decimal](12, 4) NULL,
[is_virtual] [int] NULL,
[sku] [nvarchar](255) NULL,
[name] [nvarchar](255) NULL,
[description] [nvarchar](max) NULL,
[applied_rule_ids] [nvarchar](max) NULL,
[additional_data] [nvarchar](max) NULL,
[is_qty_decimal] [int] NULL,
[no_discount] [int] NOT NULL,
[qty_backordered] [decimal](12, 4) NULL,
[qty_canceled] [decimal](12, 4) NULL,
[qty_invoiced] [decimal](12, 4) NULL,
[qty_ordered] [decimal](12, 4) NULL,
[qty_refunded] [decimal](12, 4) NULL,
[qty_shipped] [decimal](12, 4) NULL,
[base_cost] [decimal](12, 4) NULL,
[price] [decimal](12, 4) NOT NULL,
[base_price] [decimal](12, 4) NOT NULL,
[original_price] [decimal](12, 4) NULL,
[base_original_price] [decimal](12, 4) NULL,
[tax_percent] [decimal](12, 4) NULL,
[tax_amount] [decimal](20, 4) NULL,
[base_tax_amount] [decimal](20, 4) NULL,
[tax_invoiced] [decimal](20, 4) NULL,
[base_tax_invoiced] [decimal](20, 4) NULL,
[discount_percent] [decimal](12, 4) NULL,
[discount_amount] [decimal](20, 4) NULL,
[base_discount_amount] [decimal](20, 4) NULL,
[discount_invoiced] [decimal](20, 4) NULL,
[base_discount_invoiced] [decimal](20, 4) NULL,
[amount_refunded] [decimal](20, 4) NULL,
[base_amount_refunded] [decimal](20, 4) NULL,
[row_total] [decimal](20, 4) NOT NULL,
[base_row_total] [decimal](20, 4) NOT NULL,
[row_invoiced] [decimal](20, 4) NOT NULL,
[base_row_invoiced] [decimal](20, 4) NOT NULL,
[row_weight] [decimal](12, 4) NULL,
[base_tax_before_discount] [decimal](20, 4) NULL,
[tax_before_discount] [decimal](20, 4) NULL,
[ext_order_item_id] [nvarchar](255) NULL,
[locked_do_invoice] [int] NULL,
[locked_do_ship] [int] NULL,
[price_incl_tax] [decimal](20, 4) NULL,
[base_price_incl_tax] [decimal](20, 4) NULL,
[row_total_incl_tax] [decimal](20, 4) NULL,
[base_row_total_incl_tax] [decimal](20, 4) NULL,
[discount_tax_compensation_amount] [decimal](20, 4) NULL,
[base_discount_tax_compensation_amount] [decimal](20, 4) NULL,
[discount_tax_compensation_invoiced] [decimal](20, 4) NULL,
[base_discount_tax_compensation_invoiced] [decimal](20, 4) NULL,
[discount_tax_compensation_refunded] [decimal](20, 4) NULL,
[base_discount_tax_compensation_refunded] [decimal](20, 4) NULL,
[tax_canceled] [decimal](12, 4) NULL,
[discount_tax_compensation_canceled] [decimal](20, 4) NULL,
[tax_refunded] [decimal](20, 4) NULL,
[base_tax_refunded] [decimal](20, 4) NULL,
[discount_refunded] [decimal](20, 4) NULL,
[base_discount_refunded] [decimal](20, 4) NULL,
[free_shipping] [int] NOT NULL,
[qty_returned] [decimal](12, 4) NOT NULL,
[gift_message_id] [int] NULL,
[gift_message_available] [int] NULL,
[weee_tax_applied] [nvarchar](max) NULL,
[weee_tax_applied_amount] [decimal](12, 4) NULL,
[weee_tax_applied_row_amount] [decimal](12, 4) NULL,
[weee_tax_disposition] [decimal](12, 4) NULL,
[weee_tax_row_disposition] [decimal](12, 4) NULL,
[base_weee_tax_applied_amount] [decimal](12, 4) NULL,
[base_weee_tax_applied_row_amnt] [decimal](12, 4) NULL,
[base_weee_tax_disposition] [decimal](12, 4) NULL,
[base_weee_tax_row_disposition] [decimal](12, 4) NULL,
[gw_id] [int] NULL,
[gw_base_price] [decimal](12, 4) NULL,
[gw_price] [decimal](12, 4) NULL,
[gw_base_tax_amount] [decimal](12, 4) NULL,
[gw_tax_amount] [decimal](12, 4) NULL,
[gw_base_price_invoiced] [decimal](12, 4) NULL,
[gw_price_invoiced] [decimal](12, 4) NULL,
[gw_base_tax_amount_invoiced] [decimal](12, 4) NULL,
[gw_tax_amount_invoiced] [decimal](12, 4) NULL,
[gw_base_price_refunded] [decimal](12, 4) NULL,
[gw_price_refunded] [decimal](12, 4) NULL,
[gw_base_tax_amount_refunded] [decimal](12, 4) NULL,
[gw_tax_amount_refunded] [decimal](12, 4) NULL,
[event_id] [int] NULL,
[giftregistry_item_id] [int] NULL,
[business_area] [nvarchar](max) NULL,
[custom_options_has_private_data] [smallint] NULL,
[auto_ship] [smallint] NULL,
CONSTRAINT [PK_sales_order_item_item_id] PRIMARY KEY CLUSTERED
(
[item_id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 95, DATA_COMPRESSION = PAGE) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT ((0)) FOR [order_id]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT (NULL) FOR [parent_item_id]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT (NULL) FOR [quote_item_id]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT (NULL) FOR [store_id]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT (getdate()) FOR [created_at]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT (getdate()) FOR [updated_at]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT (NULL) FOR [product_id]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT (NULL) FOR [product_type]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT ((0.0000)) FOR [weight]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT (NULL) FOR [is_virtual]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT (NULL) FOR [sku]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT (NULL) FOR [name]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT (NULL) FOR [is_qty_decimal]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT ((0)) FOR [no_discount]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT ((0.0000)) FOR [qty_backordered]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT ((0.0000)) FOR [qty_canceled]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT ((0.0000)) FOR [qty_invoiced]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT ((0.0000)) FOR [qty_ordered]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT ((0.0000)) FOR [qty_refunded]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT ((0.0000)) FOR [qty_shipped]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT ((0.0000)) FOR [base_cost]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT ((0.0000)) FOR [price]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT ((0.0000)) FOR [base_price]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT (NULL) FOR [original_price]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT (NULL) FOR [base_original_price]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT ((0.0000)) FOR [tax_percent]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT ((0.0000)) FOR [tax_amount]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT ((0.0000)) FOR [base_tax_amount]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT ((0.0000)) FOR [tax_invoiced]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT ((0.0000)) FOR [base_tax_invoiced]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT ((0.0000)) FOR [discount_percent]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT ((0.0000)) FOR [discount_amount]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT ((0.0000)) FOR [base_discount_amount]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT ((0.0000)) FOR [discount_invoiced]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT ((0.0000)) FOR [base_discount_invoiced]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT ((0.0000)) FOR [amount_refunded]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT ((0.0000)) FOR [base_amount_refunded]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT ((0.0000)) FOR [row_total]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT ((0.0000)) FOR [base_row_total]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT ((0.0000)) FOR [row_invoiced]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT ((0.0000)) FOR [base_row_invoiced]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT ((0.0000)) FOR [row_weight]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT (NULL) FOR [base_tax_before_discount]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT (NULL) FOR [tax_before_discount]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT (NULL) FOR [ext_order_item_id]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT (NULL) FOR [locked_do_invoice]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT (NULL) FOR [locked_do_ship]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT (NULL) FOR [price_incl_tax]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT (NULL) FOR [base_price_incl_tax]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT (NULL) FOR [row_total_incl_tax]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT (NULL) FOR [base_row_total_incl_tax]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT (NULL) FOR [discount_tax_compensation_amount]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT (NULL) FOR [base_discount_tax_compensation_amount]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT (NULL) FOR [discount_tax_compensation_invoiced]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT (NULL) FOR [base_discount_tax_compensation_invoiced]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT (NULL) FOR [discount_tax_compensation_refunded]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT (NULL) FOR [base_discount_tax_compensation_refunded]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT (NULL) FOR [tax_canceled]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT (NULL) FOR [discount_tax_compensation_canceled]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT (NULL) FOR [tax_refunded]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT (NULL) FOR [base_tax_refunded]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT (NULL) FOR [discount_refunded]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT (NULL) FOR [base_discount_refunded]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT ((0)) FOR [free_shipping]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT ((0.0000)) FOR [qty_returned]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT (NULL) FOR [gift_message_id]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT (NULL) FOR [gift_message_available]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT (NULL) FOR [weee_tax_applied_amount]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT (NULL) FOR [weee_tax_applied_row_amount]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT (NULL) FOR [weee_tax_disposition]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT (NULL) FOR [weee_tax_row_disposition]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT (NULL) FOR [base_weee_tax_applied_amount]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT (NULL) FOR [base_weee_tax_applied_row_amnt]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT (NULL) FOR [base_weee_tax_disposition]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT (NULL) FOR [base_weee_tax_row_disposition]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT (NULL) FOR [gw_id]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT (NULL) FOR [gw_base_price]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT (NULL) FOR [gw_price]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT (NULL) FOR [gw_base_tax_amount]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT (NULL) FOR [gw_tax_amount]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT (NULL) FOR [gw_base_price_invoiced]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT (NULL) FOR [gw_price_invoiced]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT (NULL) FOR [gw_base_tax_amount_invoiced]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT (NULL) FOR [gw_tax_amount_invoiced]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT (NULL) FOR [gw_base_price_refunded]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT (NULL) FOR [gw_price_refunded]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT (NULL) FOR [gw_base_tax_amount_refunded]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT (NULL) FOR [gw_tax_amount_refunded]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT (NULL) FOR [event_id]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT (NULL) FOR [giftregistry_item_id]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT (NULL) FOR [custom_options_has_private_data]
GO
ALTER TABLE [msab_magento].[sales_order_item] ADD DEFAULT ((0)) FOR [auto_ship]
GO
EXEC sys.sp_addextendedproperty @name=N'MS_SSMA_SOURCE', @value=N'msab_magento.sales_order_item' , @level0type=N'SCHEMA',@level0name=N'msab_magento', @level1type=N'TABLE',@level1name=N'sales_order_item'
GO
请记住,您的 varchar(max) 存储在表外,并且不会使用“轻按开关”压缩模式进行压缩。
考虑
a) 取消选择 JSON 并将其存储为关系数据(如果可能)
b) 将列更改为 varbinary(max) 并在代码级别添加压缩步骤,作为触发器或 SP。 https://learn.microsoft.com/en-us/sql/t-sql/functions/compress-transact-sql?view=sql-server-ver16
To be honest, a 20GB database does not sound worth compressing.