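I wrote a crawler to access an internal website, but the site requires authentication. I have the username and password. When I visit the site in Chrome, the request headers look like this: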
Accept:text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Encoding:gzip, deflate, sdch
Accept-Language:zh-CN,zh;q=0.8
Authorization:Negotiate TlRMTVNTUAADAAAAGAAYAKIAAAB0AXQBugAAAAAAAABYAAAALAAsAFgAAAAeAB4AhAAAABAAEAAuAgAAFYKI4goA1zoAAAAPFLRu37vbFb4klUR5WAcB+EQAbwBuAGcAWQBhAG4AZwAuAEwAaQBAAGMAbgAuAGEAYgBiAC4AYwBvAG0ATABBAFAAVABPAFAALQBPAFUATQBLAE4AMgAzAEIAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA+wmUlh9FaJ3tGg0Th7GiCgEBAAAAAAAASF38+WMs0wGC8zgpW5kZqgAAAAACABYAQQBTAEkAQQBQAEEAQwBJAEYASQBDAAEAGABDAE4ALQBTAC0AVgBUAE4ARQBUADQAMAAEACYAYQBzAGkAYQBwAGEAYwBpAGYAaQBjAC4AYQBiAGIALgBjAG8AbQADAC4AQwBOAC0AUwAtAFYAVABOAEUAVAA0ADAALgBjAG4ALgBhAGIAYgAuAGMAbwBtAAUADgBhAGIAYgAuAGMAbwBtAAcACABIXfz5YyzTAQYABAACAAAACAAwADAAAAAAAAAAAQAAAAAgAABfNwvufQg0a7aFWZeALWT6tH/s3Hk2qvpGhYUj2QHPkgoAEAAAAAAAAAAAAAAAAAAAAAAACQA4AEgAVABUAFAALwBDAE4ALQBTAC0AVgBUAE4ARQBUADQAMAAuAGMAbgAuAGEAYgBiAC4AYwBvAG0AAAAAAAAAAAAAAAAAWKVX6G9UgMg+FrNergcxIg==
Cache-Control:no-cache
Connection:keep-alive
Cookie:ASP.NET_SessionId=otfoqdns33ntnm3gioypm0xh; _ga=GA1.2.548323792.1505115839; _gid=GA1.2.1779367607.1505115839
Pragma:no-cache
Upgrade-Insecure-Requests:1
User-Agent:Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.96 Safari/537.36

I tried to simulate this request by setting the request headers myself:
var options = {
  url: 'http://xxxxxxxxxxxxx',
  headers: {
    'User-Agent': 'request',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
    'Authorization': 'Negotiate TlRMTVNTUAADAAAAGAAYAKIAAAB0AXQBugAAAAAAAABYAAAALAAsAFgAAAAeAB4AhAAAABAAEAAuAgAAFYKI4goA1zoAAAAPFLRu37vbFb4klUR5WAcB+EQAbwBuAGcAWQBhAG4AZwAuAEwAaQBAAGMAbgAuAGEAYgBiAC4AYwBvAG0ATABBAFAAVABPAFAALQBPAFUATQBLAE4AMgAzAEIAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA+wmUlh9FaJ3tGg0Th7GiCgEBAAAAAAAASF38+WMs0wGC8zgpW5kZqgAAAAACABYAQQBTAEkAQQBQAEEAQwBJAEYASQBDAAEAGABDAE4ALQBTAC0AVgBUAE4ARQBUADQAMAAEACYAYQBzAGkAYQBwAGEAYwBpAGYAaQBjAC4AYQBiAGIALgBjAG8AbQADAC4AQwBOAC0AUwAtAFYAVABOAEUAVAA0ADAALgBjAG4ALgBhAGIAYgAuAGMAbwBtAAUADgBhAGIAYgAuAGMAbwBtAAcACABIXfz5YyzTAQYABAACAAAACAAwADAAAAAAAAAAAQAAAAAgAABfNwvufQg0a7aFWZeALWT6tH/s3Hk2qvpGhYUj2QHPkgoAEAAAAAAAAAAAAAAAAAAAAAAACQA4AEgAVABUAFAALwBDAE4ALQBTAC0AVgBUAE4ARQBUADQAMAAuAGMAbgAuAGEAYgBiAC4AYwBvAG0AAAAAAAAAAAAAAAAAWKV',
    'Cookie': 'ASP.NET_SessionId=yxneiqxs55gtttbhpr2v1ccp; _gat=1; _ga=GA1.2.548323792.1505115839; _gid=GA1.2.1779367607.1505115839'
  }
};

But I still can't log in. I've only just started with Node. Thanks, everyone.
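You haven't shown what the server returns when the login fails.

That Authorization value is Base64-encoded.

I just wrote a small demo to try this out. If you don't know what the restrict middleware is checking, the request gets rejected like this:

Error: Unauthorzied   <-- this is a new Error I defined myself
    at restrict (E:\dispatcher\serverConnect.js:23:37)
    other information… omitted

I suggest you post the response you are getting.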
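One way to capture the kind of information asked for here is to print the status code and the `WWW-Authenticate` header of the failed response. The sketch below is only an illustration: it uses Node's built-in `http` module rather than whatever the crawler uses, `probe` and `startDemoServer` are hypothetical names, and the local server is a stand-in that always answers 401, not the real internal site.

```javascript
const http = require('http');

// Request a URL and resolve with the status code and the authentication
// challenge the server sends back (undefined if there is none).
function probe(url, headers) {
  return new Promise((resolve, reject) => {
    // agent: false uses a one-off connection, so nothing is kept alive afterwards.
    const req = http.get(url, { headers: headers || {}, agent: false }, (res) => {
      res.resume(); // discard the body; only status and headers matter here
      res.on('end', () => resolve({
        statusCode: res.statusCode,
        wwwAuthenticate: res.headers['www-authenticate'],
      }));
    });
    req.on('error', reject);
  });
}

// Stand-in for the internal site (an assumption for this demo): it rejects
// every request with 401 and asks for Negotiate authentication.
function startDemoServer() {
  return new Promise((resolve) => {
    const server = http.createServer((req, res) => {
      res.writeHead(401, { 'WWW-Authenticate': 'Negotiate' });
      res.end('Unauthorized');
    });
    // Port 0 lets the operating system pick a free port.
    server.listen(0, '127.0.0.1', () => resolve(server));
  });
}

async function main() {
  const server = await startDemoServer();
  const port = server.address().port;
  const result = await probe('http://127.0.0.1:' + port + '/', { 'User-Agent': 'request' });
  console.log(result);
  server.close();
}

main();
```

Pointing `probe` at the real URL with the headers from the question should show which status code and which authentication scheme the server answers with, and that is the detail worth posting in the thread.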
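var a = ['user=damao&pass=a123.456', 'sign=xxxx'];
// Encode the first entry as Base64.
var zipa = Buffer.from(a[0]).toString('base64');
console.log('Encoded:', zipa);
// Decode it back to text and split on ':'.
var unzipa = Buffer.from(zipa, 'base64').toString().split(':');
console.log('Decoded:', unzipa);

Start by decoding the value of your Authorization header.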
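Applied to the header from the question, the decoding step could look like the sketch below. `describeNegotiateToken` is a hypothetical helper name. It only inspects the first bytes of the token, relying on the fact that NTLM messages start with the signature `NTLMSSP` followed by a zero byte and a little-endian message type. Only the first 16 characters of the posted token are used, since they are enough for this check.

```javascript
// Inspect the Base64 payload of a Negotiate/NTLM Authorization header.
// `describeNegotiateToken` is an illustrative name, not from the thread.
function describeNegotiateToken(headerValue) {
  // Strip the scheme name to get the Base64 payload.
  const base64 = headerValue.replace(/^(Negotiate|NTLM)\s+/i, '');
  const bytes = Buffer.from(base64, 'base64');

  // NTLM messages begin with the 8-byte signature "NTLMSSP\0".
  const signature = bytes.slice(0, 8).toString('latin1');
  if (signature !== 'NTLMSSP\u0000' || bytes.length < 12) {
    return { ntlm: false };
  }

  // The message type follows the signature as a little-endian 32-bit integer:
  // 1 = negotiate, 2 = challenge, 3 = authenticate.
  return { ntlm: true, messageType: bytes.readUInt32LE(8) };
}

// First 16 characters of the token posted in the question.
const info = describeNegotiateToken('Negotiate TlRMTVNTUAADAAAA');
console.log(info); // { ntlm: true, messageType: 3 }
```

If the result is an NTLM type 3 message, as it is for this prefix, the header is the last step of a challenge-response handshake and was computed against a challenge the server issued for that particular exchange. That would explain why copying the header out of Chrome does not log the crawler in, and it suggests the crawler needs to perform the handshake itself with the username and password.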