AskOverflow.Dev

AskOverflow.Dev Logo AskOverflow.Dev Logo

AskOverflow.Dev Navigation

  • 主页
  • 系统&网络
  • Ubuntu
  • Unix
  • DBA
  • Computer
  • Coding
  • LangChain

Mobile menu

Close
  • 主页
  • 系统&网络
    • 最新
    • 热门
    • 标签
  • Ubuntu
    • 最新
    • 热门
    • 标签
  • Unix
    • 最新
    • 标签
  • DBA
    • 最新
    • 标签
  • Computer
    • 最新
    • 标签
  • Coding
    • 最新
    • 标签
主页 / dba / 问题 / 237394
Accepted
kokomaru
kokomaru
Asked: 2019-05-05 11:57:43 +0800 CST2019-05-05 11:57:43 +0800 CST 2019-05-05 11:57:43 +0800 CST

为什么更新我们的数据库需要这么长时间?

  • 772

我目前有一个小应用程序接收 CSV 并调用一个程序来更新/插入记录,以使我们的 MySQL 5.6.39 数据库与我们使用的外部 Web 应用程序的数据库保持同步(我们没有直接访问权限或掌控)。我们每周更新一次,现在可能需要一天的大部分时间才能完成(约 60,000 条记录),所以我必须以某种方式加快速度。

这是我们使用的程序之一:

DELIMITER //

CREATE PROCEDURE `upsert-debtors` (
    internalCaseIDParam VARCHAR(50),
    caseNumberParam VARCHAR(50),
    bankruptcyCaseNumberParam VARCHAR(50),
    bankruptcyChapterNumberParam VARCHAR(50),
    bankruptcyDateFiledParam VARCHAR(50),
    bankruptcyDischargeDateParam VARCHAR(50),
    bankruptcyDismissalDateParam VARCHAR(50),
    accountReceivedDateParam VARCHAR(50),
    clientCommissionRateParam VARCHAR(50),
    clientFullNameParam VARCHAR(50),
    clientNameParam VARCHAR(50),
    collectionStatusDateParam VARCHAR(50),
    collectorUserNameParam VARCHAR(50),
    creditorParam VARCHAR(50),
    currentBalanceDueParam VARCHAR(50),
    currentClaimStatusParam VARCHAR(50),
    currentCostsParam VARCHAR(50),
    currentFeesParam VARCHAR(50),
    currentInterestParam VARCHAR(50),
    currentPerDiemParam VARCHAR(50),
    currentPrincipalParam VARCHAR(50),
    dateEnteredInSimplicityParam VARCHAR(50),
    debtorAkaParam VARCHAR(50),
    debtorAddressOneParam VARCHAR(50),
    debtorAddressTwoParam VARCHAR(50),
    debtorAddressStatusParam VARCHAR(50),
    debtorCellParam VARCHAR(50),
    debtorCityParam VARCHAR(50),
    debtorCompanyContactParam VARCHAR(50),
    debtorCompanyNameParam VARCHAR(50),
    debtorDobParam VARCHAR(50),
    debtorEmailParam VARCHAR(50),
    debtorFaxParam VARCHAR(50),
    debtorFirstNameParam VARCHAR(50),
    debtorFullNameParam VARCHAR(50),
    debtorLastNameParam VARCHAR(50),
    debtorLinkedBalanceParam VARCHAR(50),
    debtorMiddleNameParam VARCHAR(50),
    debtorOtherPhoneParam VARCHAR(50),
    debtorPhoneParam VARCHAR(50),
    debtorPrefixParam VARCHAR(50),
    debtorSSNParam VARCHAR(50),
    debtorStateParam VARCHAR(50),
    debtorSuffixParam VARCHAR(50),
    debtorZipParam VARCHAR(50),
    docketNumberParam VARCHAR(50),
    employerNameParam VARCHAR(50),
    employerPhoneParam VARCHAR(50),
    firstDelinquencyDateParam VARCHAR(50),
    hotZipParam VARCHAR(50),
    importNotesParam VARCHAR(250),
    lastAccessDateParam VARCHAR(50),
    lastDateReportedToEquifaxParam VARCHAR(50),
    lastDateReportedToExperianParam VARCHAR(50),
    lastDateReportedToTransunionParam VARCHAR(50),
    lastPaymentAmountParam VARCHAR(50),
    lastPaymentDateParam VARCHAR(50),
    lastPaymentNettedAmountParam VARCHAR(50),
    lastWorkDateParam VARCHAR(50),
    neverReportToCreditBureausParam VARCHAR(50),
    nextWorkDateParam VARCHAR(50),
    oldSysAcctSParam VARCHAR(50),
    isClosedParam VARCHAR(50),
    originalClaimAmountParam VARCHAR(50),
    originalClaimInterestRateParam VARCHAR(50),
    originatedDateParam VARCHAR(50),
    patientParam VARCHAR(50),
    patientDobParam VARCHAR(50),
    patientSsnParam VARCHAR(50),
    payerParam VARCHAR(50),
    paymentPlanAmount1Param VARCHAR(50),
    paymentPlanAmount2Param VARCHAR(50),
    paymentPlanAmount3Param VARCHAR(50),
    paymentPlanAmount4Param VARCHAR(50),
    paymentPlanAmount5Param VARCHAR(50),
    paymentPlanAmount6Param VARCHAR(50),
    nextPaymentPlanDueAmountParam VARCHAR(50),
    paymentPlanDate1Param VARCHAR(50),
    paymentPlanDate2Param VARCHAR(50),
    paymentPlanDate3Param VARCHAR(50),
    paymentPlanDate4Param VARCHAR(50),
    paymentPlanDate5Param VARCHAR(50),
    paymentPlanDate6Param VARCHAR(50),
    nextPaymentPlanDueDateParam VARCHAR(50),
    pmtPlanOverdueParam VARCHAR(50),
    physicianParam VARCHAR(50),
    pifCatalystParam VARCHAR(50),
    referringAttorneyNameParam VARCHAR(50),
    totalAttorneyFeesParam VARCHAR(50),
    totalClaimAmountParam VARCHAR(50),
    totalCostsParam VARCHAR(50),
    totalInterestParam VARCHAR(50),
    totalPaymentsParam VARCHAR(50),
    totalPaymentsPreAndPostParam VARCHAR(50),
    workflowBandParam VARCHAR(50),
    clientClaimNumberParam VARCHAR(50)
)
BEGIN
    UPDATE
        debtors
    SET
        case_number = caseNumberParam,
        bankruptcy_case_number = bankruptcyCaseNumberParam,
        bankruptcy_chapter_number = bankruptcyChapterNumberParam,
        bankruptcy_date_filed = bankruptcyDateFiledParam,
        bankruptcy_discharge_date = bankruptcyDischargeDateParam,
        bankruptcy_dismissal_date = bankruptcyDismissalDateParam,
        account_received_date = accountReceivedDateParam,
        client_commission_rate = clientCommissionRateParam,
        client_full_name = clientFullNameParam,
        client_name = clientNameParam,
        collection_status_date = collectionStatusDateParam,
        collector_user_name = collectorUserNameParam,
        creditor = creditorParam,
        current_balance_due = currentBalanceDueParam,
        current_claim_status = currentClaimStatusParam,
        current_costs = currentCostsParam,
        current_fees = currentFeesParam,
        current_interest = currentInterestParam,
        current_per_diem = currentPerDiemParam,
        current_principal = currentPrincipalParam,
        date_entered_in_simplicity = dateEnteredInSimplicityParam,
        debtor_aka = debtorAkaParam,
        debtor_address_one = debtorAddressOneParam,
        debtor_address_two = debtorAddressTwoParam,
        debtor_address_status = debtorAddressStatusParam,
        debtor_cell = debtorCellParam,
        debtor_city = debtorCityParam,
        debtor_company_contact = debtorCompanyContactParam,
        debtor_company_name = debtorCompanyNameParam,
        debtor_dob = debtorDobParam,
        debtor_email = debtorEmailParam,
        debtor_fax = debtorFaxParam,
        debtor_first_name = debtorFirstNameParam,
        debtor_full_name = debtorFullNameParam,
        debtor_last_name = debtorLastNameParam,
        debtor_linked_balance = debtorLinkedBalanceParam,
        debtor_middle_name = debtorMiddleNameParam,
        debtor_other_phone = debtorOtherPhoneParam,
        debtor_phone = debtorPhoneParam,
        debtor_prefix = debtorPrefixParam,
        debtor_s_s_n = debtorSSNParam,
        debtor_state = debtorStateParam,
        debtor_suffix = debtorSuffixParam,
        debtor_zip = debtorZipParam,
        docket_number = docketNumberParam,
        employer_name = employerNameParam,
        employer_phone = employerPhoneParam,
        first_delinquency_date = firstDelinquencyDateParam,
        hot_zip = hotZipParam,
        import_notes = importNotesParam,
        last_access_date = lastAccessDateParam,
        last_date_reported_to_equifax = lastDateReportedToEquifaxParam,
        last_date_reported_to_experian = lastDateReportedToExperianParam,
        last_date_reported_to_transunion = lastDateReportedToTransunionParam,
        last_payment_amount = lastPaymentAmountParam,
        last_payment_date = lastPaymentDateParam,
        last_payment_netted_amount = lastPaymentNettedAmountParam,
        last_work_date = lastWorkDateParam,
        never_report_to_credit_bureaus = neverReportToCreditBureausParam,
        next_work_date = nextWorkDateParam,
        old_sys_acct_s = oldSysAcctSParam,
        is_closed = isClosedParam,
        original_claim_amount = originalClaimAmountParam,
        original_claim_interest_rate = originalClaimInterestRateParam,
        originated_date = originatedDateParam,
        patient = patientParam,
        patient_dob = patientDobParam,
        patient_ssn = patientSsnParam,
        payer = payerParam,
        payment_plan_amount_1 = paymentPlanAmount1Param,
        payment_plan_amount_2 = paymentPlanAmount2Param,
        payment_plan_amount_3 = paymentPlanAmount3Param,
        payment_plan_amount_4 = paymentPlanAmount4Param,
        payment_plan_amount_5 = paymentPlanAmount5Param,
        payment_plan_amount_6 = paymentPlanAmount6Param,
        next_payment_plan_due_amount = nextPaymentPlanDueAmountParam,
        payment_plan_date_1 = paymentPlanDate1Param,
        payment_plan_date_2 = paymentPlanDate2Param,
        payment_plan_date_3 = paymentPlanDate3Param,
        payment_plan_date_4 = paymentPlanDate4Param,
        payment_plan_date_5 = paymentPlanDate5Param,
        payment_plan_date_6 = paymentPlanDate6Param,
        next_payment_plan_due_date = nextPaymentPlanDueDateParam,
        pmt_plan_overdue = pmtPlanOverdueParam,
        physician = physicianParam,
        pif_catalyst = pifCatalystParam,
        referring_attorney_name = referringAttorneyNameParam,
        total_attorney_fees = totalAttorneyFeesParam,
        total_claim_amount = totalClaimAmountParam,
        total_costs = totalCostsParam,
        total_interest = totalInterestParam,
        total_payments = totalPaymentsParam,
        total_payments_pre_and_post = totalPaymentsPreAndPostParam,
        workflow_band = workflowBandParam,
        client_claim_number = clientClaimNumberParam
    WHERE internal_case_id = internalCaseIDParam;

    IF ROW_COUNT() = 0 THEN
        INSERT INTO
          debtors (
            internal_case_id,
            case_number,
            bankruptcy_case_number,
            bankruptcy_chapter_number,
            bankruptcy_date_filed,
            bankruptcy_discharge_date,
            bankruptcy_dismissal_date,
            account_received_date,
            client_commission_rate,
            client_full_name,
            client_name,
            collection_status_date,
            collector_user_name,
            creditor,
            current_balance_due,
            current_claim_status,
            current_costs,
            current_fees,
            current_interest,
            current_per_diem,
            current_principal,
            date_entered_in_simplicity,
            debtor_aka,
            debtor_address_one,
            debtor_address_two,
            debtor_address_status,
            debtor_cell,
            debtor_city,
            debtor_company_contact,
            debtor_company_name,
            debtor_dob,
            debtor_email,
            debtor_fax,
            debtor_first_name,
            debtor_full_name,
            debtor_last_name,
            debtor_linked_balance,
            debtor_middle_name,
            debtor_other_phone,
            debtor_phone,
            debtor_prefix,
            debtor_s_s_n,
            debtor_state,
            debtor_suffix,
            debtor_zip,
            docket_number,
            employer_name,
            employer_phone,
            first_delinquency_date,
            hot_zip,
            import_notes,
            last_access_date,
            last_date_reported_to_equifax,
            last_date_reported_to_experian,
            last_date_reported_to_transunion,
            last_payment_amount,
            last_payment_date,
            last_payment_netted_amount,
            last_work_date,
            never_report_to_credit_bureaus,
            next_work_date,
            old_sys_acct_s,
            is_closed,
            original_claim_amount,
            original_claim_interest_rate,
            originated_date,
            patient,
            patient_dob,
            patient_ssn,
            payer,
            payment_plan_amount_1,
            payment_plan_amount_2,
            payment_plan_amount_3,
            payment_plan_amount_4,
            payment_plan_amount_5,
            payment_plan_amount_6,
            next_payment_plan_due_amount,
            payment_plan_date_1,
            payment_plan_date_2,
            payment_plan_date_3,
            payment_plan_date_4,
            payment_plan_date_5,
            payment_plan_date_6,
            next_payment_plan_due_date,
            pmt_plan_overdue,
            physician,
            pif_catalyst,
            referring_attorney_name,
            total_attorney_fees,
            total_claim_amount,
            total_costs,
            total_interest,
            total_payments,
            total_payments_pre_and_post,
            workflow_band,
            client_claim_number
          )
        VALUES
          (         
            internalCaseIDParam,
            caseNumberParam,
            bankruptcyCaseNumberParam,
            bankruptcyChapterNumberParam,
            bankruptcyDateFiledParam,
            bankruptcyDischargeDateParam,
            bankruptcyDismissalDateParam,
            accountReceivedDateParam,
            clientCommissionRateParam,
            clientFullNameParam,
            clientNameParam,
            collectionStatusDateParam,
            collectorUserNameParam,
            creditorParam,
            currentBalanceDueParam,
            currentClaimStatusParam,
            currentCostsParam,
            currentFeesParam,
            currentInterestParam,
            currentPerDiemParam,
            currentPrincipalParam,
            dateEnteredInSimplicityParam,
            debtorAkaParam,
            debtorAddressOneParam,
            debtorAddressTwoParam,
            debtorAddressStatusParam,
            debtorCellParam,
            debtorCityParam,
            debtorCompanyContactParam,
            debtorCompanyNameParam,
            debtorDobParam,
            debtorEmailParam,
            debtorFaxParam,
            debtorFirstNameParam,
            debtorFullNameParam,
            debtorLastNameParam,
            debtorLinkedBalanceParam,
            debtorMiddleNameParam,
            debtorOtherPhoneParam,
            debtorPhoneParam,
            debtorPrefixParam,
            debtorSSNParam,
            debtorStateParam,
            debtorSuffixParam,
            debtorZipParam,
            docketNumberParam,
            employerNameParam,
            employerPhoneParam,
            firstDelinquencyDateParam,
            hotZipParam,
            importNotesParam,
            lastAccessDateParam,
            lastDateReportedToEquifaxParam,
            lastDateReportedToExperianParam,
            lastDateReportedToTransunionParam,
            lastPaymentAmountParam,
            lastPaymentDateParam,
            lastPaymentNettedAmountParam,
            lastWorkDateParam,
            neverReportToCreditBureausParam,
            nextWorkDateParam,
            oldSysAcctSParam,
            isClosedParam,
            originalClaimAmountParam,
            originalClaimInterestRateParam,
            originatedDateParam,
            patientParam,
            patientDobParam,
            patientSsnParam,
            payerParam,
            paymentPlanAmount1Param,
            paymentPlanAmount2Param,
            paymentPlanAmount3Param,
            paymentPlanAmount4Param,
            paymentPlanAmount5Param,
            paymentPlanAmount6Param,
            nextPaymentPlanDueAmountParam,
            paymentPlanDate1Param,
            paymentPlanDate2Param,
            paymentPlanDate3Param,
            paymentPlanDate4Param,
            paymentPlanDate5Param,
            paymentPlanDate6Param,
            nextPaymentPlanDueDateParam,
            pmtPlanOverdueParam,
            physicianParam,
            pifCatalystParam,
            referringAttorneyNameParam,
            totalAttorneyFeesParam,
            totalClaimAmountParam,
            totalCostsParam,
            totalInterestParam,
            totalPaymentsParam,
            totalPaymentsPreAndPostParam,
            workflowBandParam,
            clientClaimNumberParam
          );
    END IF;
END //

DELIMITER ;

就程序的设置或某些数据库设置而言,是否有任何会导致它像现在这样慢的东西?

谢谢!

mysql upsert
  • 4 4 个回答
  • 69 Views

4 个回答

  • Voted
  1. kevinnwhat
    2019-05-05T14:07:04+08:002019-05-05T14:07:04+08:00

    看起来这个过程将为每个 internal_case_id 调用。如果您有数千个 internal_case_id,那么为每个调用此存储过程可能会比您将其转换为基于集合的更新语句更慢。

    For example if you could pull all the records from the csv into a temp table then you can join that table to your main table for your update statement, and you can have a separate insert statement for all internal_case_id that do not exist in your target table, calling the stored procedure once to achieve both.

    • 0
  2. danblack
    2019-05-05T14:51:58+08:002019-05-05T14:51:58+08:00

    Showing the debtors table structure would be useful. Assuming internal_case_id isn't a unique or primary key you should make it become one.

    As you are loading from a CSV file, a stored procedure for each entry and the code that runs through it isn't necessary as there is a LOAD DATA explicitly for this purpose.

    LOAD DATA [LOCAL] INFILE 'file.csv' IGNORE
    INTO debtors
    ...
    

    Remove or un[] LOCAL depending on if the file is on the server or the client side.

    The IGNORE is the important part of the syntax as it uses the UNIQUE or PRIMARY KEY to not override existing internal_case_id entries in the same way as your stored procedure did.

    • 0
  3. Best Answer
    Rick James
    2019-05-05T22:22:19+08:002019-05-05T22:22:19+08:00

    Databases are good at doing things en masse; they are slow at doing things one at a time.

    1. LOAD DATA (or otherwise load the data) into a temporary table
    2. Use IODKU:

      INSERT INTO debtors (...)
          SELECT ...
              FROM temp_table
          ON DUPLICATE KEY UPDATE ...;
      

    To use IODKU, there needs to be a UNIQUE (or PRIMARY) key so that it knows which row to test for the existence of.

    Be sure to use VALUES(col) in the UPDATE part. (See the documentation.)

    You should seriously consider splitting that very-wide table into a few tables such as bankruptcy, debtor, last_payment, etc. In doing so, you will quickly discover that you often don't need to update all hundred columns.

    Also see what might reasonably be 'normalized'. Both of these will make the IODKU step more complex.

    payment_plan seems to have a limit of 6. And probably has most rows empty. It would be better to have the 6 as rows, not columns. A particular debtor would have 0 or more rows, not limited to 6.

    When you are running the update, does it interfere with other activities? If so, chunk it into groups of 100 rows at a time.

    • 0
  4. Wilson Hauck
    2019-05-13T17:05:02+08:002019-05-13T17:05:02+08:00

    Suggestion to consider prior to your Weekly Update

    SET GLOBAL innodb_change_buffer_max_size=50;
    

    to enable more in RAM storage of changes until weekly update is finished.

    When done,

    SET GLOBAL innodb_change_buffer_max_size=25;
    

    for default set aside.

    Disclaimer: I am the content author of website mentioned in my profile, Networkprofile, additional suggestions and FREE Utility Scripts are available.

    • 0

相关问题

  • 是否有任何 MySQL 基准测试工具?[关闭]

  • 我在哪里可以找到mysql慢日志?

  • 如何优化大型数据库的 mysqldump?

  • 什么时候是使用 MariaDB 而不是 MySQL 的合适时机,为什么?

  • 组如何跟踪数据库架构更改?

Sidebar

Stats

  • 问题 205573
  • 回答 270741
  • 最佳答案 135370
  • 用户 68524
  • 热门
  • 回答
  • Marko Smith

    连接到 PostgreSQL 服务器:致命:主机没有 pg_hba.conf 条目

    • 12 个回答
  • Marko Smith

    如何让sqlplus的输出出现在一行中?

    • 3 个回答
  • Marko Smith

    选择具有最大日期或最晚日期的日期

    • 3 个回答
  • Marko Smith

    如何列出 PostgreSQL 中的所有模式?

    • 4 个回答
  • Marko Smith

    列出指定表的所有列

    • 5 个回答
  • Marko Smith

    如何在不修改我自己的 tnsnames.ora 的情况下使用 sqlplus 连接到位于另一台主机上的 Oracle 数据库

    • 4 个回答
  • Marko Smith

    你如何mysqldump特定的表?

    • 4 个回答
  • Marko Smith

    使用 psql 列出数据库权限

    • 10 个回答
  • Marko Smith

    如何从 PostgreSQL 中的选择查询中将值插入表中?

    • 4 个回答
  • Marko Smith

    如何使用 psql 列出所有数据库和表?

    • 7 个回答
  • Martin Hope
    Jin 连接到 PostgreSQL 服务器:致命:主机没有 pg_hba.conf 条目 2014-12-02 02:54:58 +0800 CST
  • Martin Hope
    Stéphane 如何列出 PostgreSQL 中的所有模式? 2013-04-16 11:19:16 +0800 CST
  • Martin Hope
    Mike Walsh 为什么事务日志不断增长或空间不足? 2012-12-05 18:11:22 +0800 CST
  • Martin Hope
    Stephane Rolland 列出指定表的所有列 2012-08-14 04:44:44 +0800 CST
  • Martin Hope
    haxney MySQL 能否合理地对数十亿行执行查询? 2012-07-03 11:36:13 +0800 CST
  • Martin Hope
    qazwsx 如何监控大型 .sql 文件的导入进度? 2012-05-03 08:54:41 +0800 CST
  • Martin Hope
    markdorison 你如何mysqldump特定的表? 2011-12-17 12:39:37 +0800 CST
  • Martin Hope
    Jonas 如何使用 psql 对 SQL 查询进行计时? 2011-06-04 02:22:54 +0800 CST
  • Martin Hope
    Jonas 如何从 PostgreSQL 中的选择查询中将值插入表中? 2011-05-28 00:33:05 +0800 CST
  • Martin Hope
    Jonas 如何使用 psql 列出所有数据库和表? 2011-02-18 00:45:49 +0800 CST

热门标签

sql-server mysql postgresql sql-server-2014 sql-server-2016 oracle sql-server-2008 database-design query-performance sql-server-2017

Explore

  • 主页
  • 问题
    • 最新
    • 热门
  • 标签
  • 帮助

Footer

AskOverflow.Dev

关于我们

  • 关于我们
  • 联系我们

Legal Stuff

  • Privacy Policy

Language

  • Pt
  • Server
  • Unix

© 2023 AskOverflow.DEV All Rights Reserve