William提出的问题 -coding

William

Asked: 2024-11-27 07:00:19 +0800 CST

写入增量表 spark 3.5.3 delta lake 3.2.0

看来我无法使用 spark 作业中的 delta 格式进行写入，但我不确定我遗漏了什么。我正在使用 spark 3.5.3 和 deltalake 3.2.0。

我的错误：

Exception in thread "main" org.apache.spark.SparkClassNotFoundException: [DATA_SOURCE_NOT_FOUND] Failed to find the data source: delta. Please find packages at `https://spark.apache.org/third-party-projects.html`.

我的build.sbt：

name := "test"
version := "0.1"
scalaVersion := "2.12.18"
logLevel := Level.Warn
assembly / logLevel := Level.Warn
clean / logLevel := Level.Warn

libraryDependencies += "org.apache.spark" %% "spark-core" % "3.5.3" % "provided"
libraryDependencies += "org.apache.spark" %% "spark-sql" % "3.5.3" % "provided"
libraryDependencies += "io.delta" %% "delta-spark" % "3.2.0"

assembly / test := {}
assemblyJarName := s"${name.value}-${version.value}.jar"

assemblyMergeStrategy in assembly := {
  case PathList("META-INF", _*) => MergeStrategy.discard
  case _                        => MergeStrategy.first
}

我的工作是：

val spark = SparkSession
      .builder()
      .appName("test")
      .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
      .config(
        "spark.sql.catalog.spark_catalog",
        "org.apache.spark.sql.delta.catalog.DeltaCatalog"
      )
      .getOrCreate()

val df = getData(spark)

val path = "/home/user/testtable"

df.write.format("delta").mode("overwrite").save(path)

spark.stop()

有什么想法吗？我正在查看 delta lake quickstart，但我没有发现任何遗漏的内容。不过我觉得有些东西很明显。

William

Asked: 2024-09-25 02:55:30 +0800 CST

对于跨平台网络代码来说，序列化浮点数是否必要？

我正在阅读有关网络编程的指南，我非常喜欢：https://beej.us/guide/bgnet/html/split/slightly-advanced-techniques.html#serialization

不过，我对一些事情感到困惑。在关于序列化的这一节中，他谈到了出于字节排序的原因而序列化整数，这对我来说很有意义，但他还包括这两个函数 pack754 和 unpack754，用于以 IEEE-754 格式序列化浮点数。

uint64_t pack754(long double f, unsigned bits, unsigned expbits)
{
    long double fnorm;
    int shift;
    long long sign, exp, significand;
    unsigned significandbits = bits - expbits - 1; // -1 for sign bit

    if (f == 0.0) return 0; // get this special case out of the way

    // check sign and begin normalization
    if (f < 0) { sign = 1; fnorm = -f; }
    else { sign = 0; fnorm = f; }

    // get the normalized form of f and track the exponent
    shift = 0;
    while(fnorm >= 2.0) { fnorm /= 2.0; shift++; }
    while(fnorm < 1.0) { fnorm *= 2.0; shift--; }
    fnorm = fnorm - 1.0;

    // calculate the binary form (non-float) of the significand data
    significand = fnorm * ((1LL<<significandbits) + 0.5f);

    // get the biased exponent
    exp = shift + ((1<<(expbits-1)) - 1); // shift + bias

    // return the final answer
    return (sign<<(bits-1)) | (exp<<(bits-expbits-1)) | significand;
}

long double unpack754(uint64_t i, unsigned bits, unsigned expbits)
{
    long double result;
    long long shift;
    unsigned bias;
    unsigned significandbits = bits - expbits - 1; // -1 for sign bit

    if (i == 0) return 0.0;

    // pull the significand
    result = (i&((1LL<<significandbits)-1)); // mask
    result /= (1LL<<significandbits); // convert back to float
    result += 1.0f; // add the one back on

    // deal with the exponent
    bias = (1<<(expbits-1)) - 1;
    shift = ((i>>significandbits)&((1LL<<expbits)-1)) - bias;
    while(shift > 0) { result *= 2.0; shift--; }
    while(shift < 0) { result /= 2.0; shift++; }

    // sign it
    result *= (i>>(bits-1))&1? -1.0: 1.0;

    return result;
}

我感到困惑的是，这些函数的工作原理是先查看符号的第一位，然后查看指数的下一个 X 位，再查看尾数的下一个 Y 位。那么，这是否意味着浮点数在主机上必须已经采用 IEEE-754 格式才能正常工作？

这只是为了解释格式，还是你在现实生活中实际会做的事情？

写入增量表 spark 3.5.3 delta lake 3.2.0

对于跨平台网络代码来说，序列化浮点数是否必要？

Vue 3：创建时出错“预期标识符但发现‘导入’”[重复]

为什么这个简单而小的 Java 代码在所有 Graal JVM 上的运行速度都快 30 倍，但在任何 Oracle JVM 上却不行？

具有指定基础类型但没有枚举器的“枚举类”的用途是什么？

如何修复未手动导入的模块的 MODULE_NOT_FOUND 错误？

`(表达式，左值) = 右值` 在 C 或 C++ 中是有效的赋值吗？为什么有些编译器会接受/拒绝它？

何时应使用 std::inplace_vector 而不是 std::vector？

在 C++ 中，一个不执行任何操作的空程序需要 204KB 的堆，但在 C 中则不需要

PowerBI 目前与 BigQuery 不兼容：Simba 驱动程序与 Windows 更新有关

AdMob：MobileAds.initialize() - 对于某些设备，“java.lang.Integer 无法转换为 java.lang.String”

我正在尝试仅使用海龟随机和数学模块来制作吃豆人游戏

William's questions