Questions tagged [google-cloud-functions] - Page 1

Vojtěch

Asked: 2022-03-31 20:52:31 +0800 CST

Resolving DNS from Cloud Functions in a VPC network

1

I am deploying a Cloud Function with a VPC network as follows:

# The VPC connector is needed to access compute instances:
# https://console.cloud.google.com/networking/connectors/list
# The connector should be used only to access the private network,
# hence --egress-settings private-ranges-only.
gcloud beta functions deploy my-function \
      --trigger-http \
      --region europe-west1 \
      --memory 128MB \
      --runtime nodejs16 \
      --entry-point entrypoint \
      --allow-unauthenticated \
      --vpc-connector cloud-function-connector \
      --egress-settings private-ranges-only

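Now, if my Cloud Function uses the IP addresses of the compute resources, I can reach them without problems. However, when I use their hostnames, DNS resolution fails and I end up with: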

Error: getaddrinfo ENOTFOUND my-compute-resource

What do I need to do to be able to use DNS for my compute instances?
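One way to narrow this down is to check which form of the hostname resolves from inside the function. The following is only a diagnostic sketch, written in Python rather than the Node.js runtime used above. It assumes the instances follow the Compute Engine internal DNS naming schemes (zonal `INSTANCE.ZONE.c.PROJECT.internal` and global `INSTANCE.c.PROJECT.internal`); the helper names, the zone, and the project ID are hypothetical placeholders, and whether any of these names resolves through the connector is exactly what the sketch is meant to find out.

```python
import socket


def candidate_hostnames(instance, zone, project):
    """Return the hostname forms worth trying, shortest first."""
    return [
        instance,                                   # bare name, as in the error above
        f"{instance}.{zone}.c.{project}.internal",  # zonal internal DNS name
        f"{instance}.c.{project}.internal",         # global internal DNS name
    ]


def try_resolve(hostname):
    """Return the first resolved address, or None if resolution fails."""
    try:
        infos = socket.getaddrinfo(hostname, None)
    except socket.gaierror:
        # Python's counterpart of Node's "getaddrinfo ENOTFOUND"
        return None
    return infos[0][4][0]


def report(instance, zone, project):
    """Map every candidate name to its address (or None)."""
    return {name: try_resolve(name)
            for name in candidate_hostnames(instance, zone, project)}
```

Calling `report("my-compute-resource", "europe-west1-b", "my-project")` inside the deployed function and logging the result shows which names, if any, the function can resolve.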

questionto42standswithUkraine

Asked: 2022-02-10 03:24:19 +0800 CST

jsonPayload (structured logging) output from Python logging in a Google Cloud Function, needed to create log-based metrics (LBM) in GCP

3

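I need a jsonPayload in the logs of a Google Cloud Function instead of a textPayload. My goal is to use the keys of the dictionary as labels for log-based metrics (see "Log-based metrics labels") so that they can be addressed in Grafana.

I am using Python's logging module, but I can switch to another module if needed.

I need this as output in the logs: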

jsonPayload: `{'key1':value1, 'key2':value2}`

Instead, I get a textPayload in which the whole following line is a single string:

"2022-02-08 15:43:32,460 [INFO]: {"key1": value1, "key2": value2}"

A real example from the logs; in the middle, you can see the textPayload:

The image as text:

{
insertId: "000000-1b431ffd-e42d-4f83-xyz"
labels: {1}
logName: "projects/MY_PROJECT/logs/cloudfunctions.googleapis.com%2Fcloud-functions"
receiveTimestamp: "2022-02-08T15:43:41.808217166Z"
resource: {2}
textPayload: "2022-02-08 15:43:32,460 [INFO]: {"json_metadata": {"countrows": 736203, "countcolumns": 6, "size": 48261360, "gcs_stamp": "2022-02-08 15:43:32.451000+00:00", "python_stamp": "2022-02-08 15:43:31.055538"}}"
timestamp: "2022-02-08T15:43:32.460Z"
trace: "projects/MY_PROJECT/traces/dd97759176248586a3d3xyz"
}

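First attempt

Reading from https://cloud.google.com/logging/docs/structured-logging:

In Cloud Logging, structured logs refer to log entries that use the jsonPayload field to add structure to their payloads. Structured logging applies to user-written logs.

Following "Writing structured logs", I tried to get this "structured logging" with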

logging.info(json.dumps(json_for_gcp_lbm))

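but to no avail.

Further down in the link: there is a built-in Logging agent from GCP that uses fluentd. According to "About the Logging agent", it seems to be available only for Google Kubernetes Engine or App Engine, not for Google Cloud Functions:

If you're using Google Kubernetes Engine or the App Engine flexible environment, you can write structured logs as JSON objects serialized on a single line to stdout or stderr. The Logging agent then sends the structured logs to Cloud Logging as the jsonPayload of the LogEntry structure.

How can I get a jsonPayload in this output?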
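The textPayload shown above starts with a "2022-02-08 15:43:32,460 [INFO]: " prefix, so the line that reaches the log is not pure JSON. A possible direction, sketched below, is a formatter that emits nothing but one JSON object per line on stdout. This rests on the assumption that Cloud Functions parses such single-line JSON into jsonPayload, which should be verified against the Cloud Functions logging documentation. `JsonFormatter` and `build_logger` are hypothetical names, not part of any library.

```python
import json
import logging
import sys


class JsonFormatter(logging.Formatter):
    """Render each log record as a single-line JSON object."""

    def format(self, record):
        # "severity" is the key Cloud Logging uses for the log level
        entry = {"severity": record.levelname}
        if isinstance(record.msg, dict):
            # dict messages become top-level keys of the JSON object
            entry.update(record.msg)
        else:
            entry["message"] = record.getMessage()
        # default=str keeps non-JSON types such as datetime from raising
        return json.dumps(entry, default=str)


def build_logger(name="structured", stream=None):
    """Return a logger that writes JSON lines to stdout (or to `stream`)."""
    logger = logging.getLogger(name)
    logger.handlers.clear()
    handler = logging.StreamHandler(stream or sys.stdout)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # avoid a second, prefixed copy from the root logger
    return logger
```

With this in place, `build_logger().info({"key1": 1, "key2": 2})` writes `{"severity": "INFO", "key1": 1, "key2": 2}` as one line, with no timestamp or level prefix in front of it.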

questionto42standswithUkraine

Asked: 2022-02-04 15:23:33 +0800 CST

Google Cloud Function warning "OpenBLAS WARNING - could not determine the L2 cache size on this system"

0

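I get the warning

OpenBLAS WARNING - could not determine the L2 cache size on this system, assuming 256k

in the logs on the "LOGS" tab of the Cloud Function.

There is already a Q/A on Stack Overflow, "AppEngine warning - OpenBLAS WARNING - could not determine the L2 cache size on this system", but it asks only about Google App Engine, not about Google Cloud Functions.

I would like to know how to get rid of this warning in a Google Cloud Function, and whether I should care about it at all.

It is only a warning, and the rather heavy Cloud Function (a lot of computation plus saving a 50 MB csv, 700 MB RAM needed, 1 GB allocated, 540 seconds timeout) runs through regardless. Doing nothing might well be the accepted answer.
Perhaps I can find the L2 cache size somewhere in the settings? And which system is meant here?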

questionto42standswithUkraine

Asked: 2022-02-01 09:32:39 +0800 CST

Writing a csv from a CF to a bucket: 'with open(filepath, "w") as MY_CSV:' leads to "FileNotFoundError: [Errno 2] No such file or directory:"

1

I get the error FileNotFoundError: [Errno 2] No such file or directory when I try to write a csv file to a bucket, using a csv writer that loops over batches of data. The full Cloud Function log around that error:


File "/workspace/main.py", line 299, in write_to_csv_file with
open(filepath, "w") as outcsv: FileNotFoundError: [Errno 2] No such
file or directory: 'gs://MY_BUCKET/MY_CSV.csv'

Function execution took 52655 ms, finished with status: 'crash' 

OpenBLAS WARNING - could not determine the L2 cache size on this
system, assuming 256k

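This happens even though the bucket_filepath definitely exists: I can upload an empty dummy file and get its "gsutil URI" (right-click the three dots at the right side of the file), and the bucket_filepath looks the same: 'gs://MY_BUCKET/MY_CSV.csv'.

As a check, I saved a dummy pandas dataframe with pd.to_csv instead, and that worked with the same bucket_filepath (!).

Therefore, there must be another reason; perhaps the csv writer is not accepted, or the with statement that opens the file.

The code that throws the error is below. It is the same code that works outside of Google Cloud Functions, in a normal cron job on a local server. I added two debug prints around the line that throws the error; the print("Right after opening the file ...") no longer shows up. The subfunction query_execute_batch(), which write_to_csv_file() calls for each batch, is also shown, but it is probably not the problem here, since the error already occurs at the very start, when the csv file is opened for writing.

requirements.txt (the modules are then imported):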

SQLAlchemy>=1.4.2
google-cloud-storage>=1.16.1
mysqlclient==2.1.0
pandas==1.2.3
fsspec==2021.11.1
gcsfs==2021.11.1
unicodecsv==0.14.1

From main.py:

def query_execute_batch(connection):
    """Function for reading data from the query result into batches
    :yield: each result in a loop is a batch of the query result
    """
    results = execute_select_batch(connection, SQL_QUERY)
    print(f"len(results): {len(results)}")
    for result in results:
        yield result

def write_to_csv_file(connection, filepath):
    """Write the data in a loop over batches into a csv.
    This is done in batches since the query from the database is huge.
    :param connection: mysqldb connection to DB
    :param filepath: path to csv file to write data
    returns: metadata on rows and time
    """
    countrows = 0
    print("Right before opening the file ...")    
    with open(filepath, "w") as outcsv:
        print("Right after opening the file ...")        
        writer = csv.DictWriter(
            outcsv,
            fieldnames=FIELDNAMES,
            extrasaction="ignore",
            delimiter="|",
            lineterminator="\n",
        )
        # write header according to fieldnames
        writer.writeheader()

        for batch in query_execute_batch(connection):
            writer.writerows(batch)
            countrows += len(batch)
        datetime_now_save = datetime.now()
    return countrows, datetime_now_save

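Note that for the script above to work, I import gcsfs, which makes the bucket readable and writable. Otherwise I would probably need a Google Cloud Storage object, for example: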

storage_client = storage.Client()
bucket = storage_client.bucket(BUCKET_NAME)

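and then create the file in that bucket with further functions, but that is not the aim here.

The pd.to_csv code below, which works, uses the output of a dummy SQL query, SELECT 1, as the input of a dataframe. This can be saved to the same bucket_filepath. Of course, the reason might not be pd.to_csv() as such, but also that the dataset is a dummy rather than the result of a huge SELECT query. Or there is some other reason; I am only guessing.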

if records is not None:
    df = pd.DataFrame(records.fetchall())
    df.columns = records.keys()
    df.to_csv(filepath,
        index=False,
    )
    datetime_now_save = datetime.now()
    countrows = df.shape[0]

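I would like to use the csv writer so that I can write unicode with the unicodecsv module and work in batches.

I might be willing to switch to batches in pandas (loop + append mode, or chunksize), as in "Writing large Pandas Dataframes to CSV file in chunks", to get rid of this bucket filepath problem, but I would rather use the code that is already there (never touch a running system).

How can I save that csv with the csv writer, so that it can open a new file in the bucket in write mode, that is, with open(filepath, "w") as outcsv:?

The given function write_to_csv_file() is only a small part of the Cloud Function, which uses a wide range of functions and cascaded functions. I cannot show the whole reproducible case here and hope that the question can be answered from experience or with a simpler example.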
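A likely explanation for the difference is that Python's built-in open() knows nothing about the gs:// scheme and treats 'gs://MY_BUCKET/MY_CSV.csv' as a local relative path, whereas pandas hands such URLs to fsspec/gcsfs. The sketch below keeps the csv.DictWriter batch loop from the question and only swaps how the file is opened. It assumes that the fsspec and gcsfs packages from requirements.txt are installed and that credentials are available in the function; `open_target` and `write_batches` are hypothetical helper names.

```python
import csv


def open_target(path):
    """Return a context manager that yields a writable text file."""
    if path.startswith("gs://"):
        # Assumption: fsspec and gcsfs are installed, so fsspec can
        # dispatch the gs:// URL to Google Cloud Storage.
        import fsspec
        return fsspec.open(path, "w", newline="")
    # Local paths keep using the built-in open().
    return open(path, "w", newline="")


def write_batches(outcsv, batches, fieldnames):
    """Write batches of dict rows to an open text file; return the row count."""
    writer = csv.DictWriter(
        outcsv,
        fieldnames=fieldnames,
        extrasaction="ignore",
        delimiter="|",
        lineterminator="\n",
    )
    writer.writeheader()
    countrows = 0
    for batch in batches:
        writer.writerows(batch)
        countrows += len(batch)
    return countrows
```

In write_to_csv_file(), this would amount to replacing `with open(filepath, "w") as outcsv:` by `with open_target(filepath) as outcsv:` and leaving the rest of the loop as it is.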

questionto42standswithUkraine

Asked: 2022-01-21 02:51:20 +0800 CST

Read only the metadata of a file in a Google Cloud Storage bucket into a Cloud Function in Python (without loading the file or its data!)

0

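I need something like "Cloud Storage for Firebase: download metadata of all files", only not in Angular but in Python, and only for a chosen file.

The aim is to return this information when the Cloud Function finishes with its return statement, or to log it during the run of the Cloud Function as soon as the file has been saved in the Google Storage bucket. With that information, another job can be started after the given timestamp. The pipeline is synchronous.

I have found Q/As on loading a file or its data into the Cloud Function

using the /tmp directory, as in "Reading Data From Cloud Storage Via Cloud Functions"
or loading the data (not the file) with storage.Client() or pandas pd.read_csv(), as in "How to load a file from google cloud storage to google cloud function"

to extract data statistics from the external file into the running Cloud Function.

Since I do not want to hold the large file or its data in memory at any time just to get some metadata, I want to download only the metadata of the file stored in the Google Storage bucket, that is, its timestamp and size.

How can I fetch only the metadata of a csv file in a Google Cloud Storage bucket into the Google Cloud Function?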
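One possible approach, sketched here under the assumption that the google-cloud-storage package from the earlier requirements.txt is available, is to ask the bucket for the blob's metadata record with `Bucket.get_blob()`, which fetches properties such as `size` and `updated` without downloading the object's content. The function name `blob_metadata` and the returned dictionary layout are hypothetical; the optional `client` parameter is there only so that the function can be exercised without real credentials.

```python
def blob_metadata(bucket_name, blob_name, client=None):
    """Return name, size, and timestamp of a blob without reading its data."""
    if client is None:
        # Assumption: google-cloud-storage is installed and default
        # credentials are available inside the Cloud Function.
        from google.cloud import storage
        client = storage.Client()

    # get_blob() requests only the object's metadata; it returns None
    # if the object does not exist.
    blob = client.bucket(bucket_name).get_blob(blob_name)
    if blob is None:
        return None

    return {
        "name": blob.name,
        "size": blob.size,  # size in bytes
        "updated": blob.updated.isoformat() if blob.updated else None,
    }
```

The result of `blob_metadata("MY_BUCKET", "MY_CSV.csv")` can then be logged or included in the function's return value, so that a follow-up job can compare the timestamp.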
