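I am trying to access the URL data (the clickable titles) from this table. The script fetches the first page correctly, but I can't find a way to get the data from the second page. Here is the sample script: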
function scrapeTitlesData() {
var url = "https://notices.philgeps.gov.ph/GEPSNONPILOT/Tender/SplashOpportunitiesSearchUI.aspx?menuIndex=3&BusCatID=53&type=category&ClickFrom=OpenOpp";
var response = UrlFetchApp.fetch(url).getContentText();
extractAndModifyUrls(response)
}
function extractAndModifyUrls(html) {
// Regex pattern to match the required URLs
var regex = /SplashBidNoticeAbstractUI\.aspx\?menuIndex=3&(?:amp;)?refID=\d+&[^"]+/g;
// Initialize an empty array to store the modified URLs
var modifiedUrls = [];
var match;
// Find all matches in the HTML
while ((match = regex.exec(html)) !== null) {
var newUrl = 'https://notices.philgeps.gov.ph/GEPSNONPILOT/Tender/' + match[0];
// Decode the HTML-escaped ampersands found in href attributes
newUrl = newUrl.replace(/&amp;/g, '&');
// Add the new URL to the array
modifiedUrls.push(newUrl);
}
console.log(modifiedUrls);
console.log(modifiedUrls.length)
var sheet = SpreadsheetApp.getActiveSpreadsheet().getActiveSheet();
sheet.getRange(1, 1, modifiedUrls.length, 1).setValues(modifiedUrls.map(function(url) { return [url]; }));
}
I tried adding a page number to the URL, but that doesn't move to the next page at all. Any guidance on solving this would be greatly appreciated.
You need to make a POST request with the form data and cookies. Something like this should work:
Note: getNextPage is a generator function (it yields one page at a time).
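A minimal sketch of the POST-with-form-data-and-cookies approach might look like the following. It assumes the page is a standard ASP.NET WebForms page whose pager link triggers a postback through __EVENTTARGET, and that the session cookie arrives in the Set-Cookie header of the first response. The helpers extractHiddenFields and toCookieHeader, the pagerTarget, maxPages and fetchFn parameters, and the placeholder target string are all illustrative names, not taken from the site or from the original answer.

```javascript
// Collect every hidden <input> (e.g. __VIEWSTATE, __EVENTVALIDATION) as name/value pairs.
function extractHiddenFields(html) {
  var fields = {};
  var inputRe = /<input\b[^>]*>/gi;
  var tag;
  while ((tag = inputRe.exec(html)) !== null) {
    if (!/\stype\s*=\s*["']hidden["']/i.test(tag[0])) continue;
    var name = /\sname\s*=\s*["']([^"']*)["']/i.exec(tag[0]);
    var value = /\svalue\s*=\s*["']([^"']*)["']/i.exec(tag[0]);
    if (name) fields[name[1]] = value ? value[1] : '';
  }
  return fields;
}

// Turn one or more Set-Cookie header values into a single Cookie request header.
function toCookieHeader(setCookie) {
  if (!setCookie) return '';
  var list = Array.isArray(setCookie) ? setCookie : [setCookie];
  return list.map(function (c) { return c.split(';')[0]; }).join('; ');
}

// Generator: yields the HTML of page 1, then of each following page up to maxPages.
// pagerTarget is the control name passed to __doPostBack by the "next" link (assumption:
// read it from the page source). fetchFn is injectable so the logic can be tested
// outside Apps Script; by default it wraps UrlFetchApp.fetch.
function* getNextPage(url, pagerTarget, maxPages, fetchFn) {
  fetchFn = fetchFn || function (u, options) { return UrlFetchApp.fetch(u, options); };
  var response = fetchFn(url, { muteHttpExceptions: true });
  var cookie = toCookieHeader(response.getAllHeaders()['Set-Cookie']);
  var html = response.getContentText();
  yield html;
  for (var page = 2; page <= maxPages; page++) {
    // Each postback must echo the hidden fields of the page that was just received.
    var payload = extractHiddenFields(html);
    payload['__EVENTTARGET'] = pagerTarget;
    payload['__EVENTARGUMENT'] = '';
    response = fetchFn(url, {
      method: 'post',
      payload: payload, // an object payload is sent as form data
      headers: { Cookie: cookie },
      muteHttpExceptions: true
    });
    html = response.getContentText();
    yield html;
  }
}

// Hypothetical usage inside Apps Script: log the size of the first three pages.
function logPageSizes() {
  var url = "https://notices.philgeps.gov.ph/GEPSNONPILOT/Tender/SplashOpportunitiesSearchUI.aspx?menuIndex=3&BusCatID=53&type=category&ClickFrom=OpenOpp";
  for (var html of getNextPage(url, 'REPLACE_WITH_PAGER_TARGET', 3)) {
    console.log(html.length);
  }
}
```

Each yielded HTML string can be passed to a function like extractAndModifyUrls. The pager target has to be replaced with the value found in the "next" link's __doPostBack('...','') call in the page source, and the sketch does not detect the last page, so maxPages should be chosen accordingly.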