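Our DBA created a schema for our team in HDFS/Hive. I'm not sure "schema" is the right word; they call it a "group". In any case, we can only write to the data lake inside this schema, whether as Parquet files or as Hive tables. Is there a way to check the maximum space allocated to our group, knowing only the schema name? I don't want to accidentally load too much data.
Thank you.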
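One possible way to look at this, assuming the group's schema maps to a single HDFS directory (in Hive, `DESCRIBE DATABASE <name>` reports that location) and that the DBA enforced the limit as an HDFS space quota: `hdfs dfs -count -q <path>` prints the quota columns for a directory. The sketch below only parses that output. The helper names `summarize_quota` and `check_group_quota` are made up for illustration.

```shell
#!/usr/bin/env bash
# Columns printed by `hdfs dfs -count -q <path>`:
#   QUOTA REMAINING_QUOTA SPACE_QUOTA REMAINING_SPACE_QUOTA DIR_COUNT FILE_COUNT CONTENT_SIZE PATHNAME

# Read one line of `hdfs dfs -count -q` output on stdin and summarize the space quota.
# Assumes the path contains no whitespace.
summarize_quota() {
  awk '{
    if ($3 == "none") print "no space quota set on " $8
    else print "space quota: " $3 " bytes, remaining: " $4 " bytes, path: " $8
  }'
}

# Hypothetical wrapper: $1 is the HDFS directory behind the schema (an assumption).
check_group_quota() {
  hdfs dfs -count -q "$1" | summarize_quota
}
```

It could be called as `check_group_quota /path/to/your/schema`. Note that an HDFS space quota counts every replica, so with a replication factor of 3 each byte of data uses three bytes of quota. If the limit is enforced some other way (for example by a cluster management tool), this output will show `none`.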
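I have seen spatial functions mentioned for HBase, for example "HBaseSpatial: A Scalable Spatial Data Storage Based on HBase".
Which spatial functions does HBase support, and where is this documented?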
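If the end goal is to run computations with Spark, what are the reasons for first transferring the Postgres data to HDFS (with Sqoop) rather than using Spark SQL directly against Postgres (over JDBC?)?
The answer to this question (which refers to MongoDB rather than PostgreSQL, but still applies) mentions these as the two options, but I would like to know what motivates choosing one over the other.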
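I have only recently started trying out SQL Server 2016, so please correct me if my assumptions are wrong: from some research on SQL Server R Services, I found that the RxHDFSConnect and RxHDFSFileSystem functions help load data from Hadoop directly into a SQL Server 2016 database.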
I am trying to connect to Teradata using Sqoop with the following command:
sqoop import -libjars /usr/lib/sqoop/lib/tdgssconfig.jar,/usr/lib/sqoop/lib/terajdbc4.jar -driver com.teradata.jdbc.TeraDriver --connect "jdbc:teradata://<IP>;databaseName=<DB name>;user=<user>;password=<password>" --table FACT -m 1 --target-dir /user/hduser/sqoop_trials/mangesh_test
I get the following error:
2014-09-15.15:38:48.636 TERAJDBC4 ERROR [main] com.teradata.jdbc.jdbc_4.TDSession@1b3f1bfb Connection to <IP addr>;databaseName=<Database_name>;user=<user_name>;password=<password>Mon Sep 15 15:38:48 IST 2014 socket orig=<IP addr>;databaseName=<Database_name>;user=<user_name>;password=<password>cid=391cde0 sess=0 java.net.UnknownHostException: <IP addr>;databaseName=<Database_name>;user=<user_name>;password=<password>
at java.net.Inet4AddressImpl.lookupAllHostAddr(Native Method)
at java.net.InetAddress$1.lookupAllHostAddr(InetAddress.java:901)
at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1293)
at java.net.InetAddress.getAllByName0(InetAddress.java:1246)
at java.net.InetAddress.getAllByName(InetAddress.java:1162)
at java.net.InetAddress.getAllByName(InetAddress.java:1098)
at com.teradata.jdbc.jdbc_4.io.TDNetworkIOIF$Lookup.<init>(TDNetworkIOIF.java:174)
at com.teradata.jdbc.jdbc_4.io.TDNetworkIOIF.connectToHost(TDNetworkIOIF.java:273)
at com.teradata.jdbc.jdbc_4.io.TDNetworkIOIF.<init>(TDNetworkIOIF.java:108)
at com.teradata.jdbc.jdbc_4.TDSession.getIO(TDSession.java:582)
at com.teradata.jdbc.jdbc.GenericStateController.<init>(GenericStateController.java:41)
at com.teradata.jdbc.jdbc.GenericLogonController.<init>(GenericLogonController.java:40)
at com.teradata.jdbc.jdbc_4.TDSession.<init>(TDSession.java:200)
at com.teradata.jdbc.jdbc_3.ifjdbc_4.TeraLocalConnection.<init>(TeraLocalConnection.java:99)
at com.teradata.jdbc.jdbc.ConnectionFactory.createConnection(ConnectionFactory.java:58)
at com.teradata.jdbc.TeraDriver.doConnect(TeraDriver.java:218)
at com.teradata.jdbc.TeraDriver.connect(TeraDriver.java:151)
at java.sql.DriverManager.getConnection(DriverManager.java:571)
at java.sql.DriverManager.getConnection(DriverManager.java:233)
at org.apache.sqoop.manager.SqlManager.makeConnection(SqlManager.java:824)
at org.apache.sqoop.manager.GenericJdbcManager.getConnection(GenericJdbcManager.java:52)
at org.apache.sqoop.manager.SqlManager.execute(SqlManager.java:685)
at org.apache.sqoop.manager.SqlManager.execute(SqlManager.java:708)
at org.apache.sqoop.manager.SqlManager.getColumnTypesForRawQuery(SqlManager.java:243)
at org.apache.sqoop.manager.SqlManager.getColumnTypes(SqlManager.java:226)
at org.apache.sqoop.manager.ConnManager.getColumnTypes(ConnManager.java:347)
at org.apache.sqoop.orm.ClassWriter.getColumnTypes(ClassWriter.java:1298)
at org.apache.sqoop.orm.ClassWriter.generate(ClassWriter.java:1110)
at org.apache.sqoop.tool.CodeGenTool.generateORM(CodeGenTool.java:96)
at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:396)
at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:506)
at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:222)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:231)
at org.apache.sqoop.Sqoop.main(Sqoop.java:240)
14/09/15 15:38:48 ERROR manager.SqlManager: Error executing statement: com.teradata.jdbc.jdbc_4.util.JDBCException: [Teradata JDBC Driver] [TeraJDBC 13.00.00.33] [Error 1000] [SQLState 08S01] Login failure for Connection to <IP addr>;databaseName=<Database_name>;user=<user_name>;password=<password>Mon Sep 15 15:38:48 IST 2014 socket orig=<IP addr>;databaseName=<Database_name>;user=<user_name>;password=<password>cid=391cde0 sess=0 java.net.UnknownHostException: <IP addr>;databaseName=<Database_name>;user=<user_name>;password=<password>
at java.net.Inet4AddressImpl.lookupAllHostAddr(Native Method)
at java.net.InetAddress$1.lookupAllHostAddr(InetAddress.java:901)
at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1293)
at java.net.InetAddress.getAllByName0(InetAddress.java:1246)
at java.net.InetAddress.getAllByName(InetAddress.java:1162)
at java.net.InetAddress.getAllByName(InetAddress.java:1098)
at com.teradata.jdbc.jdbc_4.io.TDNetworkIOIF$Lookup.<init>(TDNetworkIOIF.java:174)
at com.teradata.jdbc.jdbc_4.io.TDNetworkIOIF.connectToHost(TDNetworkIOIF.java:273)
at com.teradata.jdbc.jdbc_4.io.TDNetworkIOIF.<init>(TDNetworkIOIF.java:108)
at com.teradata.jdbc.jdbc_4.TDSession.getIO(TDSession.java:582)
at com.teradata.jdbc.jdbc.GenericStateController.<init>(GenericStateController.java:41)
at com.teradata.jdbc.jdbc.GenericLogonController.<init>(GenericLogonController.java:40)
at com.teradata.jdbc.jdbc_4.TDSession.<init>(TDSession.java:200)
at com.teradata.jdbc.jdbc_3.ifjdbc_4.TeraLocalConnection.<init>(TeraLocalConnection.java:99)
at com.teradata.jdbc.jdbc.ConnectionFactory.createConnection(ConnectionFactory.java:58)
at com.teradata.jdbc.TeraDriver.doConnect(TeraDriver.java:218)
at com.teradata.jdbc.TeraDriver.connect(TeraDriver.java:151)
at java.sql.DriverManager.getConnection(DriverManager.java:571)
at java.sql.DriverManager.getConnection(DriverManager.java:233)
at org.apache.sqoop.manager.SqlManager.makeConnection(SqlManager.java:824)
at org.apache.sqoop.manager.GenericJdbcManager.getConnection(GenericJdbcManager.java:52)
at org.apache.sqoop.manager.SqlManager.execute(SqlManager.java:685)
at org.apache.sqoop.manager.SqlManager.execute(SqlManager.java:708)
at org.apache.sqoop.manager.SqlManager.getColumnTypesForRawQuery(SqlManager.java:243)
at org.apache.sqoop.manager.SqlManager.getColumnTypes(SqlManager.java:226)
at org.apache.sqoop.manager.ConnManager.getColumnTypes(ConnManager.java:347)
at org.apache.sqoop.orm.ClassWriter.getColumnTypes(ClassWriter.java:1298)
at org.apache.sqoop.orm.ClassWriter.generate(ClassWriter.java:1110)
at org.apache.sqoop.tool.CodeGenTool.generateORM(CodeGenTool.java:96)
at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:396)
at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:506)
at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:222)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:231)
at org.apache.sqoop.Sqoop.main(Sqoop.java:240)
com.teradata.jdbc.jdbc_4.util.JDBCException: [Teradata JDBC Driver] [TeraJDBC 13.00.00.33] [Error 1000] [SQLState 08S01] Login failure for Connection to <IP addr>;databaseName=<Database_name>;user=<user_name>;password=<password>Mon Sep 15 15:38:48 IST 2014 socket orig=<IP addr>;databaseName=<Database_name>;user=<user_name>;password=<password>cid=391cde0 sess=0 java.net.UnknownHostException: <IP addr>;databaseName=<Database_name>;user=<user_name>;password=<password>
at java.net.Inet4AddressImpl.lookupAllHostAddr(Native Method)
at java.net.InetAddress$1.lookupAllHostAddr(InetAddress.java:901)
at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1293)
at java.net.InetAddress.getAllByName0(InetAddress.java:1246)
at java.net.InetAddress.getAllByName(InetAddress.java:1162)
at java.net.InetAddress.getAllByName(InetAddress.java:1098)
at com.teradata.jdbc.jdbc_4.io.TDNetworkIOIF$Lookup.<init>(TDNetworkIOIF.java:174)
at com.teradata.jdbc.jdbc_4.io.TDNetworkIOIF.connectToHost(TDNetworkIOIF.java:273)
at com.teradata.jdbc.jdbc_4.io.TDNetworkIOIF.<init>(TDNetworkIOIF.java:108)
at com.teradata.jdbc.jdbc_4.TDSession.getIO(TDSession.java:582)
at com.teradata.jdbc.jdbc.GenericStateController.<init>(GenericStateController.java:41)
at com.teradata.jdbc.jdbc.GenericLogonController.<init>(GenericLogonController.java:40)
at com.teradata.jdbc.jdbc_4.TDSession.<init>(TDSession.java:200)
at com.teradata.jdbc.jdbc_3.ifjdbc_4.TeraLocalConnection.<init>(TeraLocalConnection.java:99)
at com.teradata.jdbc.jdbc.ConnectionFactory.createConnection(ConnectionFactory.java:58)
at com.teradata.jdbc.TeraDriver.doConnect(TeraDriver.java:218)
at com.teradata.jdbc.TeraDriver.connect(TeraDriver.java:151)
at java.sql.DriverManager.getConnection(DriverManager.java:571)
at java.sql.DriverManager.getConnection(DriverManager.java:233)
at org.apache.sqoop.manager.SqlManager.makeConnection(SqlManager.java:824)
at org.apache.sqoop.manager.GenericJdbcManager.getConnection(GenericJdbcManager.java:52)
at org.apache.sqoop.manager.SqlManager.execute(SqlManager.java:685)
at org.apache.sqoop.manager.SqlManager.execute(SqlManager.java:708)
at org.apache.sqoop.manager.SqlManager.getColumnTypesForRawQuery(SqlManager.java:243)
at org.apache.sqoop.manager.SqlManager.getColumnTypes(SqlManager.java:226)
at org.apache.sqoop.manager.ConnManager.getColumnTypes(ConnManager.java:347)
at org.apache.sqoop.orm.ClassWriter.getColumnTypes(ClassWriter.java:1298)
at org.apache.sqoop.orm.ClassWriter.generate(ClassWriter.java:1110)
at org.apache.sqoop.tool.CodeGenTool.generateORM(CodeGenTool.java:96)
at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:396)
at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:506)
at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:222)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:231)
at org.apache.sqoop.Sqoop.main(Sqoop.java:240)
Earlier I tried listing the tables with the following command:
sqoop list-tables --connect "jdbc:teradata://<IP>;databaseName=<DB name>;user=<user>;password=<password>"
and got the error below:
14/09/15 16:10:27 ERROR tool.BaseSqoopTool: Got error creating database manager: java.io.IOException: No manager for connect string: jdbc:teradata://<IP>;databaseName=<DB name>;user=<user name>;password=<password>
at org.apache.sqoop.ConnFactory.getManager(ConnFactory.java:185)
at org.apache.sqoop.tool.BaseSqoopTool.init(BaseSqoopTool.java:243)
at org.apache.sqoop.tool.ListTablesTool.run(ListTablesTool.java:44)
at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:222)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:231)
at org.apache.sqoop.Sqoop.main(Sqoop.java:240)
Can anyone suggest where I am going wrong?
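One thing the trace itself points at: the `UnknownHostException` names the whole string `<IP addr>;databaseName=...;password=<password>`, so the driver appears to treat everything after `//` as the host name. As far as I know, the Teradata JDBC URL takes its parameters after a `/`, separated by commas (for example `DATABASE=<db>`), rather than after semicolons, and Sqoop takes credentials through its own `--username` and `--password`/`-P` options. A minimal sketch of building such a URL follows; `build_teradata_url` is a hypothetical helper name, and the host, database, and user values are placeholders.

```shell
#!/usr/bin/env bash
# Build a Teradata JDBC URL of the form jdbc:teradata://<host>/DATABASE=<db>.
# build_teradata_url is a hypothetical helper, not part of Sqoop or the Teradata driver.
#   $1 = host name or IP address, $2 = default database
build_teradata_url() {
  printf 'jdbc:teradata://%s/DATABASE=%s\n' "$1" "$2"
}

# Possible usage (placeholders throughout; not verified against a live Teradata system):
#   sqoop import \
#     --driver com.teradata.jdbc.TeraDriver \
#     --connect "$(build_teradata_url td-host.example.com my_db)" \
#     --username my_user -P \
#     --table FACT -m 1 \
#     --target-dir /user/hduser/sqoop_trials/mangesh_test
```

The `No manager for connect string` error from `list-tables` looks like a separate issue: that command was run without `--driver`, and without it Sqoop has to pick a connection manager from the URL scheme alone, which it cannot do for `jdbc:teradata` unless a Teradata connector is installed. Passing `--driver com.teradata.jdbc.TeraDriver` there as well would be the first thing to try.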