关于【selenium】的问题- 第1页

Hooman Bahreini

Asked: 2020-10-30 12:24:31 +0800 CST

无法为我的 scrapy 程序创建 crontab 作业

8

我写了一个小的 Python 爬虫（使用 Scrapy 框架）。刮板需要无头浏览...我正在使用 ChromeDriver。

当我在没有任何 GUI 的 Ubuntu 服务器上运行此代码时，我必须安装 Xvfb 才能在我的 Ubuntu 服务器上运行 ChromeDriver（我遵循了本指南）

这是我的代码：

class MySpider(scrapy.Spider):
    name = 'my_spider'

    def __init__(self):
        # self.driver = webdriver.Chrome(ChromeDriverManager().install())
        chrome_options = Options()
        chrome_options.add_argument('--headless')
        chrome_options.add_argument('--no-sandbox')
        chrome_options.add_argument('--disable-dev-shm-usage')
        self.driver = webdriver.Chrome('/usr/bin/chromedriver', chrome_options=chrome_options)

我可以从 Ubuntu shell 运行上面的代码，它执行时没有任何错误：

ubuntu@ip-1-2-3-4:~/scrapers/my_scraper$ scrapy crawl my_spider

现在我想设置一个 cron 作业来每天运行上述命令：

# m h  dom mon dow   command
PATH=/usr/local/bin:/home/ubuntu/.local/bin/
05 12 * * * cd /home/ubuntu/scrapers/my_scraper && scrapy crawl my_spider >> /tmp/scraper.log 2>&1

但是 crontab 作业给了我以下错误：

Traceback (most recent call last):
  File "/home/ubuntu/.local/lib/python3.6/site-packages/scrapy/crawler.py", line 192, in crawl
    return self._crawl(crawler, *args, **kwargs)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/scrapy/crawler.py", line 196, in _crawl
    d = crawler.crawl(*args, **kwargs)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/twisted/internet/defer.py", line 1613, in unwindGenerator
    return _cancellableInlineCallbacks(gen)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/twisted/internet/defer.py", line 1529, in _cancellableInlineCallbacks
    _inlineCallbacks(None, g, status)
--- <exception caught here> ---
  File "/home/ubuntu/.local/lib/python3.6/site-packages/twisted/internet/defer.py", line 1418, in _inlineCallbacks
    result = g.send(result)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/scrapy/crawler.py", line 86, in crawl
    self.spider = self._create_spider(*args, **kwargs)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/scrapy/crawler.py", line 98, in _create_spider
    return self.spidercls.from_crawler(self, *args, **kwargs)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/scrapy/spiders/__init__.py", line 19, in from_crawler
    spider = cls(*args, **kwargs)
  File "/home/ubuntu/scrapers/my_scraper/my_scraper/spiders/spider.py", line 27, in __init__
    self.driver = webdriver.Chrome('/usr/bin/chromedriver', chrome_options=chrome_options)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/selenium/webdriver/chrome/webdriver.py", line 81, in __init__
    desired_capabilities=desired_capabilities)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/selenium/webdriver/remote/webdriver.py", line 157, in __init__
    self.start_session(capabilities, browser_profile)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/selenium/webdriver/remote/webdriver.py", line 252, in start_session
    response = self.execute(Command.NEW_SESSION, parameters)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/selenium/webdriver/remote/webdriver.py", line 321, in execute
    self.error_handler.check_response(response)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/selenium/webdriver/remote/errorhandler.py", line 242, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.WebDriverException: Message: unknown error: Chrome failed to start: exited abnormally
  (unknown error: DevToolsActivePort file doesn't exist)
  (The process started from chrome location /usr/bin/google-chrome is no longer running, so ChromeDriver is assuming that Chrome has crashed.)
  (Driver info: chromedriver=2.41.578700 (2f1ed5f9343c13f73144538f15c00b370eda6706),platform=Linux 5.4.0-1029-aws x86_64)

更新

这个答案帮助我解决了这个问题（但我不太明白为什么）

我echo $PATH在我的 Ubuntu shell 上运行并将值复制到 crontab 中：

PATH=/home/ubuntu/.local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin
05 12 * * * cd /home/ubuntu/scrapers/my_scraper && scrapy crawl my_spider >> /tmp/scraper.log 2>&1

注意：由于我为这个问题创建了一个赏金，我很高兴将它奖励给任何解释为什么更改 PATH 解决了问题的答案。

Akshat Zala

Asked: 2020-05-08 19:54:35 +0800 CST

如何在 ubuntu 中为 python 脚本启用 chromedriver？

1

对于计划任务，我希望使用 python 脚本执行 selenium webdriver。

我尝试了以下解决方案：

https://stackoverflow.com/questions/50117377/selenium-with-chromedriver-doesnt-start-via-cron

但它并没有解决问题。

色度驱动器的位置：

~/Documents/Python/Chromedriver/chromedriver

Prasanth Ganesan

Asked: 2020-01-01 05:00:43 +0800 CST

如何在单声道中将 dll 加载到 C# 并运行它

1

我有一些用 Windows 编写的 Selenium 测试用例。现在我想在 Ubuntu 中运行它们。我找到了一种使用此解决方案链接我的 C# 程序和 selenium dll 的方法，但出现以下错误：

Unhandled Exception: System.IO.FileNotFoundException:   
Could not load file or assembly 'WebDriver, Version=3.14.0.0, Culture=neutral, PublicKeyToken=null' or one of its dependencies.   
File name: 'WebDriver, Version=3.14.0.0, Culture=neutral,   
PublicKeyToken=null' [ERROR] FATAL UNHANDLED EXCEPTION:  
System.IO.FileNotFoundException:  
Could not load file or assembly 'WebDriver, Version=3.14.0.0, Culture=neutral, PublicKeyToken=nullure=neutral, PublicKeyToken=null

C＃：

using System;
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;

namespace SelTest {
    class TestOne {
        public static void Main(string[] args) {
            IWebDriver driver = new ChromeDriver();
            driver.Navigate().GoToUrl("https://www.google.com");
            driver.Quit();
        }
    }
}

用于编译和运行的命令：

mcs -r:/home/local/prasanth/test/dist/lib/net45/WebDriver.dll SelTest.cs; ./SelTest.exe

我的单声道版本：6.6.0.161

selenium 库是从这里下载的。

user124

Asked: 2019-09-17 02:33:50 +0800 CST

Python Selenium 导入错误

1

我正在尝试执行一些 python 项目，但出现以下错误：-

File "feed.py", line 17, in <module>
    from selenium import webdriver
ImportError: No module named selenium

我使用 pip 命令为 python2 和 python3 安装了 selenium 包。如果我在终端上运行 python，如果我尝试导入 selenium，我不会收到此错误。

但是我正在使用 venv 执行项目。我无法弄清楚它试图从哪里寻找硒包。

当我在终端上执行 python2.7/python/python 并运行 from selenium import webdriver时，我没有看到任何导入错误。但是我执行时的项目给了我错误。我无法找到正在查找硒的路径。我怎样才能找到它？

junichironakashima

Asked: 2019-07-24 19:14:31 +0800 CST

如何让python在进入下一行代码之前等待程序停止

3

我在 python 中创建了一个脚本，它将打开一个程序，然后 python 将等待该程序自行关闭，然后继续执行下一个代码。这是我的脚本：

Import subprocess as sp
sp.Popen([r'C:/Folder/folder/a.exe'])
??????
????????
print("test")

问号是我不知道的事情。

A-K

Asked: 2019-04-21 19:16:58 +0800 CST

在 VirtualBox 中试用 Ubuntu 和安装 Ubuntu 选项有什么区别？

4

“试用 Ubuntu”和“安装 Ubuntu”选项有什么区别？

我在我的 Windows 10 上安装了 VirtualBox，并在我的桌面上下载了 Ubuntu ISO。

我已经配置了 VirtualBox 并为 Virtualbox 提供了 Ubuntu ISO。我有两个选项——“试用 Ubuntu”和“安装 Ubuntu”。我不确定该采取哪个选项。

我的要求是我想并行运行我需要多台机器的 Selenium 脚本，因此使用 VM。

如果我选择安装 Ubuntu，它会改变我笔记本电脑的文件系统吗？我打算暂时使用VM来学习跨多台机器并行执行脚本的概念。学习后我想删除 VirtualBox 并且不想要 Ubuntu。

Thufir

Asked: 2018-12-27 11:10:00 +0800 CST

sudo apt install firefoxdriver 做什么？

1

使用Javafor ，对firefoxdriver包Selenium很好奇。有几个特定的包，但是否可能以一种易于基于项目找到的方式将其安装到系统中？python firefoxdrivergeckodriverJavaSelenium

我一直在使用属性文件：

<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<properties>
    <comment>selenium config and vehicle type</comment>
    <entry key="gecko">/home/thufir/.gecko/geckodriver</entry>
    <entry key="url">http://books.toscrape.com/</entry>
    <entry key="driver">webdriver.gecko.driver</entry>
    <entry key="usr">admin</entry>
    <entry key="pwd">12345</entry>
    <entry key="option01">--headless</entry>
</properties>

如果它指向系统上的某个标准位置，它会稍微不那么脆弱。（我想也可以将它捆绑在JAR, 中。）

thufir@dur:~$ 
thufir@dur:~$ sudo apt install firefoxdriver
Reading package lists... Done
Building dependency tree       
Reading state information... Done
firefoxdriver is already the newest version (3.8.0-1).
0 upgraded, 0 newly installed, 0 to remove and 98 not upgraded.
thufir@dur:~$

kimbo

Asked: 2018-06-30 09:50:46 +0800 CST

当我 ctrl + 单击 gnome 终端中的链接时如何停止收到错误消息和两个选项卡

0

我在 Ubuntu 16.04 上。我使用 google-chrome 作为我的默认浏览器。google-chrome当我使用或从终端打开它时效果很好google-chrome-stable。当我在 pycharm 中运行我的测试服务器并单击链接时，它也很有效 - 立即打开。

但是，当我从终端单击任何链接时，它会在另一个窗口中打开 chrome（即使我已经打开它）并给我两个选项卡。

第一个地址是http://extension%3D/tmp/.org.chromium.Chromium.KmLkmq/internal，上面写着无法访问此站点。找不到扩展 %3D 的服务器 IP 地址。

第二个地址是data:，它只是一个空白页。

它们的顶部都有一个小栏，上面写着“Chrome 正在由自动化测试软件控制”。

这发生在不久前，我相信那是在我使用 selenium 和 chromedriver 打开和读取某些网页的数据时，主要是在修补。我猜有些东西卡住了，现在它会永远这样做。从那以后我尝试了一些东西，但似乎没有任何效果。

如果我将默认浏览器更改为 firefox，它仍然会打开这两个奇怪的页面。

我应该尝试重新安装 chrome 还是有人有其他解决方案？

这是一个屏幕截图

shubham bansal

Asked: 2018-05-30 03:04:58 +0800 CST

我想在我的 Ubuntu 16.04 系统中为 python 安装 selenium webdriver

3

当我安装 Selenium 时，出现以下错误：

Shubham@Shubham-To-be-filled-by-O-E-M:~$ sudo apt-get update
    Get:1 http://security.ubuntu.com/ubuntu xenial-security InRelease [107 kB]
    Hit:2 https://repo.skype.com/deb stable InRelease                       
    Hit:3 http://in.archive.ubuntu.com/ubuntu xenial InRelease                     
    Get:4 http://in.archive.ubuntu.com/ubuntu xenial-updates InRelease [109 kB]
    Get:5 http://in.archive.ubuntu.com/ubuntu xenial-backports InRelease [107 kB]  
    Fetched 323 kB in 8s (38.6 kB/s)                                               
    Reading package lists... Done
    Shubham@Shubham-To-be-filled-by-O-E-M:~$ sudo pip install selenium
    Traceback (most recent call last):
      File "/usr/bin/pip", line 9, in <module>
        from pip import main
    ImportError: cannot import name main

我应该如何进行？

无法为我的 scrapy 程序创建 crontab 作业

更新

如何在 ubuntu 中为 python 脚本启用 chromedriver？

如何在单声道中将 dll 加载到 C# 并运行它

Python Selenium 导入错误

如何让python在进入下一行代码之前等待程序停止

在 VirtualBox 中试用 Ubuntu 和安装 Ubuntu 选项有什么区别？

sudo apt install firefoxdriver 做什么？

当我 ctrl + 单击 gnome 终端中的链接时如何停止收到错误消息和两个选项卡

我想在我的 Ubuntu 16.04 系统中为 python 安装 selenium webdriver

如何运行 .sh 脚本？

如何安装 .tar.gz（或 .tar.bz2）文件？

如何列出所有已安装的软件包

无法锁定管理目录 (/var/lib/dpkg/) 是另一个进程在使用它吗？

问题[selenium](ubuntu)

更新