我目前有一个小应用程序接收 CSV 并调用一个程序来更新/插入记录,以使我们的 MySQL 5.6.39 数据库与我们使用的外部 Web 应用程序的数据库保持同步(我们没有直接访问权限或掌控)。我们每周更新一次,现在可能需要一天的大部分时间才能完成(约 60,000 条记录),所以我必须以某种方式加快速度。
这是我们使用的程序之一:
DELIMITER //
CREATE PROCEDURE `upsert-debtors` (
internalCaseIDParam VARCHAR(50),
caseNumberParam VARCHAR(50),
bankruptcyCaseNumberParam VARCHAR(50),
bankruptcyChapterNumberParam VARCHAR(50),
bankruptcyDateFiledParam VARCHAR(50),
bankruptcyDischargeDateParam VARCHAR(50),
bankruptcyDismissalDateParam VARCHAR(50),
accountReceivedDateParam VARCHAR(50),
clientCommissionRateParam VARCHAR(50),
clientFullNameParam VARCHAR(50),
clientNameParam VARCHAR(50),
collectionStatusDateParam VARCHAR(50),
collectorUserNameParam VARCHAR(50),
creditorParam VARCHAR(50),
currentBalanceDueParam VARCHAR(50),
currentClaimStatusParam VARCHAR(50),
currentCostsParam VARCHAR(50),
currentFeesParam VARCHAR(50),
currentInterestParam VARCHAR(50),
currentPerDiemParam VARCHAR(50),
currentPrincipalParam VARCHAR(50),
dateEnteredInSimplicityParam VARCHAR(50),
debtorAkaParam VARCHAR(50),
debtorAddressOneParam VARCHAR(50),
debtorAddressTwoParam VARCHAR(50),
debtorAddressStatusParam VARCHAR(50),
debtorCellParam VARCHAR(50),
debtorCityParam VARCHAR(50),
debtorCompanyContactParam VARCHAR(50),
debtorCompanyNameParam VARCHAR(50),
debtorDobParam VARCHAR(50),
debtorEmailParam VARCHAR(50),
debtorFaxParam VARCHAR(50),
debtorFirstNameParam VARCHAR(50),
debtorFullNameParam VARCHAR(50),
debtorLastNameParam VARCHAR(50),
debtorLinkedBalanceParam VARCHAR(50),
debtorMiddleNameParam VARCHAR(50),
debtorOtherPhoneParam VARCHAR(50),
debtorPhoneParam VARCHAR(50),
debtorPrefixParam VARCHAR(50),
debtorSSNParam VARCHAR(50),
debtorStateParam VARCHAR(50),
debtorSuffixParam VARCHAR(50),
debtorZipParam VARCHAR(50),
docketNumberParam VARCHAR(50),
employerNameParam VARCHAR(50),
employerPhoneParam VARCHAR(50),
firstDelinquencyDateParam VARCHAR(50),
hotZipParam VARCHAR(50),
importNotesParam VARCHAR(250),
lastAccessDateParam VARCHAR(50),
lastDateReportedToEquifaxParam VARCHAR(50),
lastDateReportedToExperianParam VARCHAR(50),
lastDateReportedToTransunionParam VARCHAR(50),
lastPaymentAmountParam VARCHAR(50),
lastPaymentDateParam VARCHAR(50),
lastPaymentNettedAmountParam VARCHAR(50),
lastWorkDateParam VARCHAR(50),
neverReportToCreditBureausParam VARCHAR(50),
nextWorkDateParam VARCHAR(50),
oldSysAcctSParam VARCHAR(50),
isClosedParam VARCHAR(50),
originalClaimAmountParam VARCHAR(50),
originalClaimInterestRateParam VARCHAR(50),
originatedDateParam VARCHAR(50),
patientParam VARCHAR(50),
patientDobParam VARCHAR(50),
patientSsnParam VARCHAR(50),
payerParam VARCHAR(50),
paymentPlanAmount1Param VARCHAR(50),
paymentPlanAmount2Param VARCHAR(50),
paymentPlanAmount3Param VARCHAR(50),
paymentPlanAmount4Param VARCHAR(50),
paymentPlanAmount5Param VARCHAR(50),
paymentPlanAmount6Param VARCHAR(50),
nextPaymentPlanDueAmountParam VARCHAR(50),
paymentPlanDate1Param VARCHAR(50),
paymentPlanDate2Param VARCHAR(50),
paymentPlanDate3Param VARCHAR(50),
paymentPlanDate4Param VARCHAR(50),
paymentPlanDate5Param VARCHAR(50),
paymentPlanDate6Param VARCHAR(50),
nextPaymentPlanDueDateParam VARCHAR(50),
pmtPlanOverdueParam VARCHAR(50),
physicianParam VARCHAR(50),
pifCatalystParam VARCHAR(50),
referringAttorneyNameParam VARCHAR(50),
totalAttorneyFeesParam VARCHAR(50),
totalClaimAmountParam VARCHAR(50),
totalCostsParam VARCHAR(50),
totalInterestParam VARCHAR(50),
totalPaymentsParam VARCHAR(50),
totalPaymentsPreAndPostParam VARCHAR(50),
workflowBandParam VARCHAR(50),
clientClaimNumberParam VARCHAR(50)
)
BEGIN
UPDATE
debtors
SET
case_number = caseNumberParam,
bankruptcy_case_number = bankruptcyCaseNumberParam,
bankruptcy_chapter_number = bankruptcyChapterNumberParam,
bankruptcy_date_filed = bankruptcyDateFiledParam,
bankruptcy_discharge_date = bankruptcyDischargeDateParam,
bankruptcy_dismissal_date = bankruptcyDismissalDateParam,
account_received_date = accountReceivedDateParam,
client_commission_rate = clientCommissionRateParam,
client_full_name = clientFullNameParam,
client_name = clientNameParam,
collection_status_date = collectionStatusDateParam,
collector_user_name = collectorUserNameParam,
creditor = creditorParam,
current_balance_due = currentBalanceDueParam,
current_claim_status = currentClaimStatusParam,
current_costs = currentCostsParam,
current_fees = currentFeesParam,
current_interest = currentInterestParam,
current_per_diem = currentPerDiemParam,
current_principal = currentPrincipalParam,
date_entered_in_simplicity = dateEnteredInSimplicityParam,
debtor_aka = debtorAkaParam,
debtor_address_one = debtorAddressOneParam,
debtor_address_two = debtorAddressTwoParam,
debtor_address_status = debtorAddressStatusParam,
debtor_cell = debtorCellParam,
debtor_city = debtorCityParam,
debtor_company_contact = debtorCompanyContactParam,
debtor_company_name = debtorCompanyNameParam,
debtor_dob = debtorDobParam,
debtor_email = debtorEmailParam,
debtor_fax = debtorFaxParam,
debtor_first_name = debtorFirstNameParam,
debtor_full_name = debtorFullNameParam,
debtor_last_name = debtorLastNameParam,
debtor_linked_balance = debtorLinkedBalanceParam,
debtor_middle_name = debtorMiddleNameParam,
debtor_other_phone = debtorOtherPhoneParam,
debtor_phone = debtorPhoneParam,
debtor_prefix = debtorPrefixParam,
debtor_s_s_n = debtorSSNParam,
debtor_state = debtorStateParam,
debtor_suffix = debtorSuffixParam,
debtor_zip = debtorZipParam,
docket_number = docketNumberParam,
employer_name = employerNameParam,
employer_phone = employerPhoneParam,
first_delinquency_date = firstDelinquencyDateParam,
hot_zip = hotZipParam,
import_notes = importNotesParam,
last_access_date = lastAccessDateParam,
last_date_reported_to_equifax = lastDateReportedToEquifaxParam,
last_date_reported_to_experian = lastDateReportedToExperianParam,
last_date_reported_to_transunion = lastDateReportedToTransunionParam,
last_payment_amount = lastPaymentAmountParam,
last_payment_date = lastPaymentDateParam,
last_payment_netted_amount = lastPaymentNettedAmountParam,
last_work_date = lastWorkDateParam,
never_report_to_credit_bureaus = neverReportToCreditBureausParam,
next_work_date = nextWorkDateParam,
old_sys_acct_s = oldSysAcctSParam,
is_closed = isClosedParam,
original_claim_amount = originalClaimAmountParam,
original_claim_interest_rate = originalClaimInterestRateParam,
originated_date = originatedDateParam,
patient = patientParam,
patient_dob = patientDobParam,
patient_ssn = patientSsnParam,
payer = payerParam,
payment_plan_amount_1 = paymentPlanAmount1Param,
payment_plan_amount_2 = paymentPlanAmount2Param,
payment_plan_amount_3 = paymentPlanAmount3Param,
payment_plan_amount_4 = paymentPlanAmount4Param,
payment_plan_amount_5 = paymentPlanAmount5Param,
payment_plan_amount_6 = paymentPlanAmount6Param,
next_payment_plan_due_amount = nextPaymentPlanDueAmountParam,
payment_plan_date_1 = paymentPlanDate1Param,
payment_plan_date_2 = paymentPlanDate2Param,
payment_plan_date_3 = paymentPlanDate3Param,
payment_plan_date_4 = paymentPlanDate4Param,
payment_plan_date_5 = paymentPlanDate5Param,
payment_plan_date_6 = paymentPlanDate6Param,
next_payment_plan_due_date = nextPaymentPlanDueDateParam,
pmt_plan_overdue = pmtPlanOverdueParam,
physician = physicianParam,
pif_catalyst = pifCatalystParam,
referring_attorney_name = referringAttorneyNameParam,
total_attorney_fees = totalAttorneyFeesParam,
total_claim_amount = totalClaimAmountParam,
total_costs = totalCostsParam,
total_interest = totalInterestParam,
total_payments = totalPaymentsParam,
total_payments_pre_and_post = totalPaymentsPreAndPostParam,
workflow_band = workflowBandParam,
client_claim_number = clientClaimNumberParam
WHERE internal_case_id = internalCaseIDParam;
IF ROW_COUNT() = 0 THEN
INSERT INTO
debtors (
internal_case_id,
case_number,
bankruptcy_case_number,
bankruptcy_chapter_number,
bankruptcy_date_filed,
bankruptcy_discharge_date,
bankruptcy_dismissal_date,
account_received_date,
client_commission_rate,
client_full_name,
client_name,
collection_status_date,
collector_user_name,
creditor,
current_balance_due,
current_claim_status,
current_costs,
current_fees,
current_interest,
current_per_diem,
current_principal,
date_entered_in_simplicity,
debtor_aka,
debtor_address_one,
debtor_address_two,
debtor_address_status,
debtor_cell,
debtor_city,
debtor_company_contact,
debtor_company_name,
debtor_dob,
debtor_email,
debtor_fax,
debtor_first_name,
debtor_full_name,
debtor_last_name,
debtor_linked_balance,
debtor_middle_name,
debtor_other_phone,
debtor_phone,
debtor_prefix,
debtor_s_s_n,
debtor_state,
debtor_suffix,
debtor_zip,
docket_number,
employer_name,
employer_phone,
first_delinquency_date,
hot_zip,
import_notes,
last_access_date,
last_date_reported_to_equifax,
last_date_reported_to_experian,
last_date_reported_to_transunion,
last_payment_amount,
last_payment_date,
last_payment_netted_amount,
last_work_date,
never_report_to_credit_bureaus,
next_work_date,
old_sys_acct_s,
is_closed,
original_claim_amount,
original_claim_interest_rate,
originated_date,
patient,
patient_dob,
patient_ssn,
payer,
payment_plan_amount_1,
payment_plan_amount_2,
payment_plan_amount_3,
payment_plan_amount_4,
payment_plan_amount_5,
payment_plan_amount_6,
next_payment_plan_due_amount,
payment_plan_date_1,
payment_plan_date_2,
payment_plan_date_3,
payment_plan_date_4,
payment_plan_date_5,
payment_plan_date_6,
next_payment_plan_due_date,
pmt_plan_overdue,
physician,
pif_catalyst,
referring_attorney_name,
total_attorney_fees,
total_claim_amount,
total_costs,
total_interest,
total_payments,
total_payments_pre_and_post,
workflow_band,
client_claim_number
)
VALUES
(
internalCaseIDParam,
caseNumberParam,
bankruptcyCaseNumberParam,
bankruptcyChapterNumberParam,
bankruptcyDateFiledParam,
bankruptcyDischargeDateParam,
bankruptcyDismissalDateParam,
accountReceivedDateParam,
clientCommissionRateParam,
clientFullNameParam,
clientNameParam,
collectionStatusDateParam,
collectorUserNameParam,
creditorParam,
currentBalanceDueParam,
currentClaimStatusParam,
currentCostsParam,
currentFeesParam,
currentInterestParam,
currentPerDiemParam,
currentPrincipalParam,
dateEnteredInSimplicityParam,
debtorAkaParam,
debtorAddressOneParam,
debtorAddressTwoParam,
debtorAddressStatusParam,
debtorCellParam,
debtorCityParam,
debtorCompanyContactParam,
debtorCompanyNameParam,
debtorDobParam,
debtorEmailParam,
debtorFaxParam,
debtorFirstNameParam,
debtorFullNameParam,
debtorLastNameParam,
debtorLinkedBalanceParam,
debtorMiddleNameParam,
debtorOtherPhoneParam,
debtorPhoneParam,
debtorPrefixParam,
debtorSSNParam,
debtorStateParam,
debtorSuffixParam,
debtorZipParam,
docketNumberParam,
employerNameParam,
employerPhoneParam,
firstDelinquencyDateParam,
hotZipParam,
importNotesParam,
lastAccessDateParam,
lastDateReportedToEquifaxParam,
lastDateReportedToExperianParam,
lastDateReportedToTransunionParam,
lastPaymentAmountParam,
lastPaymentDateParam,
lastPaymentNettedAmountParam,
lastWorkDateParam,
neverReportToCreditBureausParam,
nextWorkDateParam,
oldSysAcctSParam,
isClosedParam,
originalClaimAmountParam,
originalClaimInterestRateParam,
originatedDateParam,
patientParam,
patientDobParam,
patientSsnParam,
payerParam,
paymentPlanAmount1Param,
paymentPlanAmount2Param,
paymentPlanAmount3Param,
paymentPlanAmount4Param,
paymentPlanAmount5Param,
paymentPlanAmount6Param,
nextPaymentPlanDueAmountParam,
paymentPlanDate1Param,
paymentPlanDate2Param,
paymentPlanDate3Param,
paymentPlanDate4Param,
paymentPlanDate5Param,
paymentPlanDate6Param,
nextPaymentPlanDueDateParam,
pmtPlanOverdueParam,
physicianParam,
pifCatalystParam,
referringAttorneyNameParam,
totalAttorneyFeesParam,
totalClaimAmountParam,
totalCostsParam,
totalInterestParam,
totalPaymentsParam,
totalPaymentsPreAndPostParam,
workflowBandParam,
clientClaimNumberParam
);
END IF;
END //
DELIMITER ;
就程序的设置或某些数据库设置而言,是否有任何会导致它像现在这样慢的东西?
谢谢!
看起来这个过程将为每个 internal_case_id 调用。如果您有数千个 internal_case_id,那么为每个调用此存储过程可能会比您将其转换为基于集合的更新语句更慢。
For example if you could pull all the records from the csv into a temp table then you can join that table to your main table for your update statement, and you can have a separate insert statement for all internal_case_id that do not exist in your target table, calling the stored procedure once to achieve both.
Showing the debtors table structure would be useful. Assuming internal_case_id isn't a unique or primary key you should make it become one.
As you are loading from a CSV file, a stored procedure for each entry and the code that runs through it isn't necessary as there is a LOAD DATA explicitly for this purpose.
Remove or un[]
LOCAL
depending on if the file is on the server or the client side.The
IGNORE
is the important part of the syntax as it uses the UNIQUE or PRIMARY KEY to not override existing internal_case_id entries in the same way as your stored procedure did.Databases are good at doing things en masse; they are slow at doing things one at a time.
Use IODKU:
To use IODKU, there needs to be a
UNIQUE
(orPRIMARY
) key so that it knows which row to test for the existence of.Be sure to use
VALUES(col)
in theUPDATE
part. (See the documentation.)You should seriously consider splitting that very-wide table into a few tables such as
bankruptcy
,debtor
,last_payment
, etc. In doing so, you will quickly discover that you often don't need to update all hundred columns.Also see what might reasonably be 'normalized'. Both of these will make the IODKU step more complex.
payment_plan
seems to have a limit of 6. And probably has most rows empty. It would be better to have the 6 as rows, not columns. A particular debtor would have 0 or more rows, not limited to 6.When you are running the update, does it interfere with other activities? If so, chunk it into groups of 100 rows at a time.
Suggestion to consider prior to your Weekly Update
to enable more in RAM storage of changes until weekly update is finished.
When done,
for default set aside.
Disclaimer: I am the content author of website mentioned in my profile, Networkprofile, additional suggestions and FREE Utility Scripts are available.