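I have patient death data from a hospital. The data is not arranged correctly, as the sample below shows.
All dates should be in the DOA (column H), DOD (column I), or MB (column J) columns, and the remaining text should be arranged in separate columns. Can anyone help me clean this data? I have more than 5,000 observations.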
+-------+-----------+-------------+---------------+---------+-------------------+----------------------------------------------------+------------------+---------------------------------+------------+-----------------------+
| Sl.NO | District | State P No | Age In Years | Sex | Symptoms | Co-Morbidities | DOA | DOD | MB Date | Notes |
+=======+===========+=============+===============+=========+===================+====================================================+==================+=================================+============+=======================+
| 10 | X4 | 6553 | 53 | F | Fever | Cold | Cough | Thyroid disease | 10-06-2020 | 20-06-2020 |
+-------+-----------+-------------+---------------+---------+-------------------+----------------------------------------------------+------------------+---------------------------------+------------+-----------------------+
| 11 | X5 | 8872 | 62 | M | Fever | Diabetes Mellitus | 16-06-2020 | 16-06-2020 | 21-06-2020 | |
+-------+-----------+-------------+---------------+---------+-------------------+----------------------------------------------------+------------------+---------------------------------+------------+-----------------------+
| 12 | X5 | 8880 | 55 | M | Pneumonia | Respiratory distress Obese, Chronic Alcoholic | 18-06-2020 | 20-06-2020 | 21-06-2020 | |
+-------+-----------+-------------+---------------+---------+-------------------+----------------------------------------------------+------------------+---------------------------------+------------+-----------------------+
| 13 | X2 | 9149 | 70 | M | Loss of Appetite | Weakness, Hypertension | 18-06-2020 | 18-06-2020 | 21-06-2020 | |
+-------+-----------+-------------+---------------+---------+-------------------+----------------------------------------------------+------------------+---------------------------------+------------+-----------------------+
| 14 | X3 | 9150 | 46 | M | Weakness | Convulsions, Hypertension | 17-06-2020 | 18-06-2020 | 21-06-2020 | |
+-------+-----------+-------------+---------------+---------+-------------------+----------------------------------------------------+------------------+---------------------------------+------------+-----------------------+
| 15 | X4 | 7732 | 60 | Female | Fever | Cough | Breathlessness | uncontrolled Diabetes Mellitus | 17-06-2020 | 22-06-2020 |
+-------+-----------+-------------+---------------+---------+-------------------+----------------------------------------------------+------------------+---------------------------------+------------+-----------------------+
| 16 | X5 | 9237 | 90 | M | Asymptomatic | Hypertension | | 20-06-2020 | 22-06-2020 | Died at his residence |
+-------+-----------+-------------+---------------+---------+-------------------+----------------------------------------------------+------------------+---------------------------------+------------+-----------------------+
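You can get the desired output with Power Query, available in Windows Excel 2010+ and Office 365 Excel.
Load the data with Data => Get&Transform => From Table/Range, then open Home => Advanced Editor to work with the M code.
Explore the Applied Steps window to better understand the algorithm and steps.
Based on your data sample, I assumed that when an error occurs, it is because commas separated three co-morbidities.
That being the case, I tested whether the DOD column contains a date. If that logic does not hold for every row, or it is wrong, it can easily be changed.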
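The DOD test described above can be illustrated outside Excel. The following is a minimal Python sketch, not the answer's M code: the helper name `is_date` and the `dd-mm-yyyy` format are assumptions taken from the sample table, and the DOD values are copied from the rows with Sl.NO 10, 11, 15 and 16.

```python
from datetime import datetime

DATE_FORMAT = "%d-%m-%Y"  # assumed from the sample, e.g. 20-06-2020


def is_date(cell):
    """Return True if the cell text parses as a dd-mm-yyyy date."""
    try:
        datetime.strptime(cell.strip(), DATE_FORMAT)
        return True
    except ValueError:
        return False


# (Sl.NO, DOD cell) pairs copied from the sample table
dod_cells = [
    (10, "Thyroid disease"),
    (11, "16-06-2020"),
    (15, "uncontrolled Diabetes Mellitus"),
    (16, "20-06-2020"),
]

# Rows whose DOD cell is not a date are treated as misaligned
misaligned = [sl_no for sl_no, dod in dod_cells if not is_date(dod)]
print(misaligned)  # [10, 15]
```

Note that a genuinely blank DOD cell would also be flagged by this test, so flagged rows still need a manual look.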
All of the processing magic happens in the arguments to the Table.Group function.
M Code
Results
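Do this in stages (because you will have to check it)! That is mainly because your sample shows only 2 dates in the bad rows, so some data may be missing. It looks like more than a simple spreadsheet fix.
The original data is in A2:K8. DOA is column H and DOD is column I.
TRUE: the dates are fine. FALSE: no date, so there is a data-entry error.
If TRUE, use the existing data; otherwise move the misplaced data to the proper columns.
Make a backup. Enter the formulas. Check, check, check. Convert to values and delete the bad columns.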
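The staged check-then-move idea can be sketched in Python as well. This is only an illustration under loud assumptions, not the formulas from this answer: in a bad row the last text cell is taken to be the co-morbidity and the earlier ones symptoms, and any dates found are assigned to DOA, DOD, MB from left to right. Since the bad rows in the sample carry only 2 dates, that assignment is a guess, so every repaired row is flagged for review. The names `realign` and `is_date` are hypothetical.

```python
from datetime import datetime


def is_date(cell):
    """Return True if the cell text parses as a dd-mm-yyyy date."""
    try:
        datetime.strptime(cell.strip(), "%d-%m-%Y")
        return True
    except ValueError:
        return False


def realign(row):
    """row: 11 cells in sheet order A..K. Returns (fixed_row, needs_review)."""
    head, tail = row[:5], row[5:]  # A..E untouched; F..K may be shifted
    if is_date(tail[3]):           # column I (DOD) holds a date: keep the row
        return list(row), False
    texts = [c.strip() for c in tail if c.strip() and not is_date(c)]
    dates = [c.strip() for c in tail if is_date(c)]
    # Assumption: last text cell is the co-morbidity, the rest are symptoms
    symptoms = ", ".join(texts[:-1])
    comorbidity = texts[-1] if texts else ""
    # Assumption: dates fill DOA, DOD, MB left to right; missing ones stay blank
    dates = (dates + ["", "", ""])[:3]
    return head + [symptoms, comorbidity] + dates + [""], True


# Row with Sl.NO 10 from the sample table
bad_row = ["10", "X4", "6553", "53", "F", "Fever", "Cold", "Cough",
           "Thyroid disease", "10-06-2020", "20-06-2020"]
fixed, needs_review = realign(bad_row)
print(fixed[5:], needs_review)
```

Keeping the original rows alongside the repaired ones, as with the backup advice above, makes it possible to compare the two before discarding anything.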