AskOverflow.Dev

SQL_Deadwood
Asked: 2017-03-01 10:31:36 +0800 CST

How to convert data to proper case in SQL Server


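SQL Server contains system functions for viewing or updating string data to uppercase and lowercase, but not to proper case. There are multiple reasons for wanting this operation to take place in SQL Server rather than at the application layer. In my case, we were performing some data cleansing during a consolidation of our global HR data from multiple sources.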

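If you search the internet, you will find multiple solutions to this task, but many seem to come with restrictive caveats or do not allow exceptions to be defined in the function.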
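
For illustration, here is a minimal sketch of what the built-in functions do and why a naive combination of them falls short. The variable name and sample value are made up for the example and are not from the data described above.

```sql
-- Illustrative value only; not taken from the data described above.
DECLARE @Name varchar(50) = 'JOHN SMITH';

SELECT UpperCased = UPPER(@Name),  -- JOHN SMITH
       LowerCased = LOWER(@Name),  -- john smith
       -- Naive attempt: capitalizes only the first letter of the whole string
       FirstOnly  = UPPER(LEFT(@Name, 1))
                    + LOWER(SUBSTRING(@Name, 2, LEN(@Name)));  -- John smith
```

The second word stays lowercase, which is why the answers below split the input into words or walk it character by character.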

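NOTE: As mentioned in the comments below, SQL Server is not the ideal place to perform this conversion. Other methods have been suggested as well, such as CLR. In my opinion, this post has served its purpose: it is great to have all of these thoughts in one place rather than as random tidbits scattered here and there. Thank you all.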

sql-server t-sql
  • 82467 Views

3 Answers

  1. billinkc
    2017-03-02T12:43:49+08:00

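    The challenge you'll run into with these approaches is that you have lost information. Explain to the business users that they have taken a blurry, out-of-focus picture and that, despite what they've seen on TV, there is no way to make it crisp and in focus. There will always be situations where these rules won't work, and as long as everyone knows going in that this is the case, then have at it.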

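    This is HR data, so I'm going to assume we're talking about getting names into a consistent title-case format, because the mainframe stores them as AARON BERTRAND and we want the new system to not yell at them. Aaron is easy (but not cheap). You and Hannah have already identified the problem of Mc/Mac, so the code capitalizes Mc/Mac correctly, but there are cases where it's too aggressive, as with Mackey/Maclin/Mackenzie. Mackenzie is an interesting case, though: look at how popular it has become as a baby name.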

    [Image: Mackenzie]

    At some point, there will be a poor child named Mackenzie MacKenzie, because people are awful beings.

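    You're also going to run into lovely things like D'Antoni, where we should capitalize both letters around the tick mark. Except for d'Autremont, where you capitalize only the letter after the apostrophe. Heaven help you, though, if you send mail to d'Illoni when their family name is D'illoni.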

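    For the sake of contributing actual code, the following is a CLR method we used in a 2005 instance. It generally uses ToTitleCase, except for the list of exceptions we built out, which is the point at which we basically gave up trying to codify the exceptions described above.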

    namespace Common.Util
    {
        using System;
        using System.Collections.Generic;
        using System.Globalization;
        using System.Text;
        using System.Text.RegularExpressions;
        using System.Threading;
        
        /// <summary>
        /// A class that attempts to proper case a word, taking into
        /// consideration some outliers.
        /// </summary>
        public class ProperCase
        {
            /// <summary>
            /// Convert a string into its propercased equivalent.  General case
            /// it will capitalize the first letter of each word.  Handled special 
            /// cases include names with apostrophes (O'Shea), and Scottish/Irish
            /// surnames MacInnes, McDonalds.  Will fail for Macbeth, Macaroni, etc
            /// </summary>
            /// <param name="inputText">The data to be recased into initial caps</param>
            /// <returns>The input text resampled as proper cased</returns>
            public static string Case(string inputText)
            {
                CultureInfo cultureInfo = Thread.CurrentThread.CurrentCulture;
                TextInfo textInfo = cultureInfo.TextInfo;
                string output = null;
                int staticHack = 0;
    
                Regex expression = null;
                string matchPattern = string.Empty;
    
                // Should think about maybe matching the first non blank character
                matchPattern = @"
                    (?<Apostrophe>'.\B)| # Match things like O'Shea so apostrophe plus one.  Think about white space between ' and next letter.  TODO:  Correct it's from becoming It'S, can't -> CaN'T
                    \bMac(?<Mac>.) | # MacInnes, MacGyver, etc.  Will fail for Macbeth
                    \bMc(?<Mc>.) # McDonalds
                    ";
                expression = new Regex(matchPattern, RegexOptions.IgnorePatternWhitespace | RegexOptions.IgnoreCase);
    
                // Handle our funky rules            
                // Using named matches is probably overkill as the
                // same rule applies to all but for future growth, I'm
                // defining it as such.
                // Quirky behaviour---for 2005, the compiler will 
                // make this into a static method which is verboten for 
                // safe assemblies.  
                MatchEvaluator upperCase = delegate(Match match)
                {
                    // Based on advice from Chris Hedgate's blog
                    // I need to reference a local variable to prevent
                    // this from being turned into static
                    staticHack = matchPattern.Length;
    
                    if (!string.IsNullOrEmpty(match.Groups["Apostrophe"].Value))
                    {
                        return match.Groups["Apostrophe"].Value.ToUpper();
                    }
    
                    if (!string.IsNullOrEmpty(match.Groups["Mac"].Value))
                    {
                        return string.Format("Mac{0}", match.Groups["Mac"].Value.ToUpper());
                    }
    
                    if (!string.IsNullOrEmpty(match.Groups["Mc"].Value))
                    {
                        return string.Format("Mc{0}", match.Groups["Mc"].Value.ToUpper());
                    }
    
                    return match.Value;
                };
    
                MatchEvaluator evaluator = new MatchEvaluator(upperCase);
    
                if (inputText != null)
                {
                    // Generally, title casing converts the first character 
                    // of a word to uppercase and the rest of the characters 
                    // to lowercase. However, a word that is entirely uppercase, 
                    // such as an acronym, is not converted.
                    // http://msdn.microsoft.com/en-us/library/system.globalization.textinfo.totitlecase(VS.80).aspx
                    string temporary = string.Empty;
                    temporary = textInfo.ToTitleCase(inputText.ToString().ToLower());
                    output = expression.Replace(temporary, evaluator);
                }
                else
                {
                    output = string.Empty;
                }
    
                return output;
            }
        }
    }
    

    Now that all of that is clear, I'm going to finish this lovely book of poetry by e e cummings.

    • 14
  2. Hannah Vernon
    2017-03-01T20:26:24+08:00

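    I realize you've already got a good solution, but I thought I'd add a simpler one that uses an inline table-valued function, albeit one that relies on the upcoming "vNext" version of SQL Server (since released as SQL Server 2017), which includes the STRING_AGG() and STRING_SPLIT() functions: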

    IF OBJECT_ID('dbo.fn_TitleCase') IS NOT NULL
    DROP FUNCTION dbo.fn_TitleCase;
    GO
    CREATE FUNCTION dbo.fn_TitleCase
    (
        @Input nvarchar(1000)
    )
    RETURNS TABLE
    AS
    RETURN
    SELECT Item = STRING_AGG(splits.Word, ' ')
    FROM (
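        -- SUBSTRING (rather than RIGHT with LEN - 1) avoids an invalid-length error on empty tokens, e.g. from double spaces
        SELECT Word = UPPER(LEFT(value, 1)) + LOWER(SUBSTRING(value, 2, LEN(value)))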
        FROM STRING_SPLIT(@Input, ' ')
        ) splits(Word);
    GO
    

    Testing the function:

    SELECT *
    FROM dbo.fn_TitleCase('this is a test');
    

    This Is A Test

    SELECT *
    FROM dbo.fn_TitleCase('THIS IS A TEST');
    

    This Is A Test

    For documentation on STRING_AGG() and STRING_SPLIT(), see MSDN.

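    Be aware that the STRING_SPLIT() function does not guarantee that items are returned in any particular order. This can be most annoying. There is a Microsoft Feedback item requesting that a column be added to the output of STRING_SPLIT to denote the order of the output. Consider upvoting it.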
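
    On the ordering concern: in SQL Server 2022 and Azure SQL Database, STRING_SPLIT accepts an optional third argument, enable_ordinal, which adds an ordinal column to its output, and STRING_AGG supports WITHIN GROUP (ORDER BY ...). A minimal sketch of the same title-casing idea with a guaranteed word order, assuming one of those versions is available (the variable names are illustrative):

    ```sql
    -- Assumes SQL Server 2022 or Azure SQL Database, where STRING_SPLIT
    -- accepts enable_ordinal as a third argument.
    DECLARE @Input nvarchar(1000) = N'THIS IS A TEST';
    DECLARE @Result nvarchar(1000);

    -- Capitalize each token, then reassemble the tokens in their original order.
    SELECT @Result = STRING_AGG(
               UPPER(LEFT(s.value, 1)) + LOWER(SUBSTRING(s.value, 2, LEN(s.value))),
               N' ') WITHIN GROUP (ORDER BY s.ordinal)
    FROM STRING_SPLIT(@Input, N' ', 1) AS s;

    SELECT Item = @Result;  -- This Is A Test
    ```

    On versions without enable_ordinal, the order of the reassembled words still cannot be relied upon.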

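    If you want to live on the edge and use this approach, it can be extended to include exceptions. I've built an inline table-valued function that does just that: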

    CREATE FUNCTION dbo.fn_TitleCase
    (
        @Input nvarchar(1000)
        , @SepList nvarchar(1)
    )
    RETURNS TABLE
    AS
    RETURN
    WITH Exceptions AS (
        SELECT v.ItemToFind
            , v.Replacement
        FROM (VALUES /* add further exceptions to the list below */
              ('mca', 'McA')
            , ('maca','MacA')
            ) v(ItemToFind, Replacement)
    )
    , Source AS (
        -- SUBSTRING (rather than RIGHT with LEN - 1) avoids an invalid-length error on empty tokens
        SELECT Word = UPPER(LEFT(value, 1)) + LOWER(SUBSTRING(value, 2, LEN(value)))
            , Num = ROW_NUMBER() OVER (ORDER BY GETDATE())
        FROM STRING_SPLIT(@Input, @SepList) 
    )
    SELECT Item = STRING_AGG(splits.Word, @SepList)
    FROM (
        SELECT TOP 2147483647 Word
        FROM (
            SELECT Word = REPLACE(Source.Word, Exceptions.ItemToFind, Exceptions.Replacement)
                , Source.Num
            FROM Source
            CROSS APPLY Exceptions
            WHERE Source.Word LIKE Exceptions.ItemToFind + '%'
            UNION ALL
            SELECT Word = Source.Word
                , Source.Num
            FROM Source
            WHERE NOT EXISTS (
                SELECT 1
                FROM Exceptions
                WHERE Source.Word LIKE Exceptions.ItemToFind + '%'
                )
            ) w
        ORDER BY Num
        ) splits;
    GO
    

    Testing it shows how it works:

    SELECT *
    FROM dbo.fn_TitleCase('THIS IS A TEST MCADAMS MACKENZIE MACADAMS', ' ');
    

    This Is A Test McAdams Mackenzie MacAdams

    • 7
  3. Best Answer
    SQL_Deadwood
    2017-03-01T10:31:36+08:00

    The best solution I've come across can be found here.

    I've altered the script slightly: I added LTRIM and RTRIM to the returned value because, in some cases, the script was adding spaces after the value.
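
    One reading of the function's loop (an interpretation, not something its author states) explains the extra space: the loop runs one position past the end of @Value, SUBSTRING returns an empty string there, and storing that in the char(1) variable @c pads it to a single space, which then gets appended. A minimal sketch of that effect, using made-up values:

    ```sql
    DECLARE @Value varchar(20) = 'simpson';

    -- One position past the end: SUBSTRING yields '', which char(1) pads to ' '.
    DECLARE @c char(1) = SUBSTRING(@Value, LEN(@Value) + 1, 1);

    -- Stand-in for the string the loop builds up (word plus final "separator").
    DECLARE @Built varchar(20) = 'Simpson' + @c;

    SELECT RawLength     = DATALENGTH(@Built),                -- 8
           TrimmedLength = DATALENGTH(LTRIM(RTRIM(@Built)));  -- 7
    ```

    DATALENGTH is used rather than LEN because LEN ignores trailing spaces and would hide the difference.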

    Usage example for previewing the conversion from UPPERCASE data to proper case, with exceptions:

    SELECT <column>,[dbo].[fProperCase](<column>,'|APT|HWY|BOX|',NULL)
    FROM <table> WHERE <column>=UPPER(<column>)
    

    The truly simple yet powerful aspect of the script is the ability to define exceptions within the function call itself.

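    One thing to note, however:
    As currently written, the script does not correctly handle surnames such as Mc[A-Z]%, Mac[A-Z]%, etc. I'm currently working on edits to handle that scenario.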

    As a workaround, I changed the value the function returns: REPLACE(REPLACE(LTRIM(RTRIM((@ProperCaseText))),'Mcd','McD'),'Mci','McI') etc...

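    This method obviously requires foreknowledge of the data and is not ideal. I'm sure there is a way to solve this, but I'm in the middle of a conversion and currently don't have the time to dedicate to this pesky problem.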
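
    A possible sketch of a variant that does not need to know the third letter in advance. It assumes the value is a single surname that has already been proper-cased and that starts with Mc; @Name and @Fixed are illustrative names. It deliberately leaves Mac alone, since names such as Mackenzie would otherwise be miscapitalized.

    ```sql
    DECLARE @Name varchar(100) = 'Mcdonald';  -- output of an earlier proper-casing step
    DECLARE @Fixed varchar(100);

    -- If the value starts with Mc followed by a letter, uppercase that third character.
    SET @Fixed = CASE WHEN @Name LIKE 'Mc[a-z]%'
                      THEN STUFF(@Name, 3, 1, UPPER(SUBSTRING(@Name, 3, 1)))
                      ELSE @Name
                 END;

    SELECT Fixed = @Fixed;  -- McDonald
    ```

    This only looks at the start of the string, so a value holding several words would need each word checked separately.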

    Here's the code:

    CREATE FUNCTION [dbo].[fProperCase](@Value varchar(8000), @Exceptions varchar(8000),@UCASEWordLength tinyint)
    returns varchar(8000)
    as
    /* ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    Function Purpose: To convert text to Proper Case.
    Created By:             David Wiseman
    Website:                http://www.wisesoft.co.uk
    Created:                2005-10-03
    Updated:                2006-06-22
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    INPUTS:
    
    @Value :                This is the text to be converted to Proper Case
    @Exceptions:            A list of exceptions to the default Proper Case rules. e.g. |RAM|CPU|HDD|TFT|
                                  Without exception list they would display as Ram, Cpu, Hdd and Tft
                                  Note the use of the Pipe "|" symbol to separate exceptions.
                                  (You can change the @sep variable to something else if you prefer)
    @UCASEWordLength: You can specify that words up to and including a certain length are automatically displayed in UPPERCASE
    
    USAGE1:
    
    Convert text to ProperCase, without any exceptions
    
    select dbo.fProperCase('THIS FUNCTION WAS CREATED BY DAVID WISEMAN',null,null)
    >> This Function Was Created By David Wiseman
    
    USAGE2:
    
    Convert text to Proper Case, with exception for WiseSoft
    
    select dbo.fProperCase('THIS FUNCTION WAS CREATED BY DAVID WISEMAN @ WISESOFT','|WiseSoft|',null)
    >> This Function Was Created By David Wiseman @ WiseSoft
    
    USAGE3:
    
    Convert text to Proper Case and default words of 3 chars or fewer to UPPERCASE
    
    select dbo.fProperCase('SIMPSON, HJ',null,3)
    >> Simpson, HJ
    
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ */
    begin
          declare @sep char(1) -- Separator character for exceptions
          declare @i int -- counter
          declare @ProperCaseText varchar(5000) -- Used to build our Proper Case string for Function return
          declare @Word varchar(1000) -- Temporary storage for each word
          declare @IsWhiteSpace as bit -- Used to indicate whitespace character/start of new word
          declare @c char(1) -- Temp storage location for each character
    
          set @Word = ''
          set @i = 1
          set @IsWhiteSpace = 1
          set @ProperCaseText = ''
          set @sep = '|'
    
          -- Set default UPPERCASEWord Length
          if @UCASEWordLength is null set @UCASEWordLength = 1
          -- Convert user input to lower case (This function will UPPERCASE words as required)
          set @Value = LOWER(@Value)
    
          -- Loop while counter is less than text length (for each character in...)
          while (@i <= len(@Value)+1)
          begin
    
                -- Get the current character
                set @c = SUBSTRING(@Value,@i,1)
    
                -- If start of new word, UPPERCASE character
                if @IsWhiteSpace = 1 set @c = UPPER(@c)
    
                -- Check if character is white space/symbol (using ascii values)
                set @IsWhiteSpace = case when (ASCII(@c) between 48 and 58) then 0
                                              when (ASCII(@c) between 64 and 90) then 0
                                              when (ASCII(@c) between 96 and 123) then 0
                                              else 1 end
    
                if @IsWhiteSpace = 0
                begin
                      -- Append character to temp @Word variable if not whitespace
                      set @Word = @Word + @c
                end
                else
                begin
                      -- Character is white space/punctuation/symbol which marks the end of our current word.
                      -- If word length is less than or equal to the UPPERCASE word length, convert to upper case.
                      -- e.g. you can specify a @UCASEWordLength of 3 to automatically UPPERCASE all 3 letter words.
                      set @Word = case when len(@Word) <= @UCASEWordLength then UPPER(@Word) else @Word end
    
                      -- Check word against user exceptions list. If exception is found, use the case specified in the exception.
                      -- e.g. WiseSoft, RAM, CPU.
                      -- If word isn't in user exceptions list, check for "known" exceptions.
                      set @Word = case when charindex(@sep + @Word + @sep,@exceptions collate Latin1_General_CI_AS) > 0
                                        then substring(@exceptions,charindex(@sep + @Word + @sep,@exceptions collate Latin1_General_CI_AS)+1,len(@Word))
                                        when @Word = 's' and substring(@Value,@i-2,1) = '''' then 's' -- e.g. Who's
                                        when @Word = 't' and substring(@Value,@i-2,1) = '''' then 't' -- e.g. Don't
                                        when @Word = 'm' and substring(@Value,@i-2,1) = '''' then 'm' -- e.g. I'm
                                        when @Word = 'll' and substring(@Value,@i-3,1) = '''' then 'll' -- e.g. He'll
                                        when @Word = 've' and substring(@Value,@i-3,1) = '''' then 've' -- e.g. Could've
                                        else @Word end
    
                      -- Append the word to the @ProperCaseText along with the whitespace character
                      set @ProperCaseText = @ProperCaseText + @Word + @c
                      -- Reset the Temp @Word variable, ready for a new word
                      set @Word = ''
                end
                -- Increment the counter
                set @i = @i + 1
          end
          return @ProperCaseText
    end
    
    • 2
