Content-Length: 387631 | pFad | https://github.com/mubeng/mubeng/pull/248

73 feat: add `--max-redirects` option by yudelevi · Pull Request #248 · mubeng/mubeng · GitHub
Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add --max-redirects option #248

Merged
merged 3 commits into from
Sep 4, 2024
Merged

Conversation

yudelevi
Copy link
Contributor

@yudelevi yudelevi commented Sep 3, 2024

IMPORTANT: Please do not create a Pull Request without creating an issue first!

(Any change needs to be discussed before proceeding. Failure to do so may result in the rejection of the pull request).

Summary

I've run into a few issues with redirects (#247, #230) that were caused by the fact that the HTTP library in go silently follows 10 redirects and fails without returning the last call.
Clients using the proxy would fail without any insight and without receiving a proper failure code. Moreover, in some use cases, the 3xx code and destination is a result by itself.

Proposed of changes

This PR adds a --max-retries option that defaults to the standard 10 redirects the HTTP library follows.
In addition to that, the last result is returned in a case of 3xx redirect.

How has this been tested?

Proof:


dyudelevich@MadMacs mubeng % make test
Testing mubeng package v0.15.3-8-g46197be
ok  	github.com/kitabisa/mubeng/pkg/mubeng	0.011s
ok  	github.com/kitabisa/mubeng/pkg/helper	0.008s
dyudelevich@MadMacs mubeng % make test-extra 
Run GolangCI-Lint
INFO golangci-lint has version 1.60.3 built with go1.23.0 from c2e095c on 2024-08-22T21:45:24Z 
INFO [config_reader] Config search paths: [./ /Users/dyudelevich/dev/mubeng /Users/dyudelevich/dev /Users/dyudelevich /Users /] 
INFO [lintersdb] Active 6 linters: [errcheck gosimple govet ineffassign staticcheck unused] 
INFO [loader] Go packages loading at mode 575 (name|deps|files|imports|types_sizes|compiled_files|exports_file) took 255.663947ms 
INFO [runner/filename_unadjuster] Pre-built 0 adjustments in 1.423605ms 
INFO [linters_context/goanalysis] analyzers took 0s with no stages 
INFO [runner] Issues before processing: 12, after processing: 2 
INFO [runner] Processors filtering stat (in/out): diff: 2/2, max_from_linter: 2/2, path_prettifier: 12/12, skip_files: 12/12, identifier_marker: 12/12, exclude: 12/12, nolint: 2/2, uniq_by_line: 2/2, path_shortener: 2/2, path_prefixer: 2/2, max_same_issues: 2/2, severity-rules: 2/2, fixer: 2/2, cgo: 12/12, filename_unadjuster: 12/12, autogenerated_exclude: 12/12, exclude-rules: 12/2, max_per_file_from_linter: 2/2, source_code: 2/2, invalid_issue: 12/12, skip_dirs: 12/12, sort_results: 2/2 
INFO [runner] processing took 1.172241ms with stages: exclude-rules: 271.921µs, identifier_marker: 211.399µs, autogenerated_exclude: 201.764µs, path_prettifier: 176.97µs, nolint: 167.24µs, source_code: 71.223µs, skip_dirs: 54.732µs, filename_unadjuster: 4.353µs, cgo: 3.798µs, uniq_by_line: 1.896µs, invalid_issue: 1.773µs, max_same_issues: 1.322µs, max_from_linter: 1.012µs, path_shortener: 919ns, fixer: 475ns, max_per_file_from_linter: 403ns, skip_files: 277ns, exclude: 210ns, sort_results: 200ns, diff: 142ns, path_prefixer: 108ns, severity-rules: 104ns 
INFO [runner] linters took 74.893599ms with stages: goanalysis_metalinter: 73.567471ms 
internal/server/init.go:9:2: SA1019: rand.Seed has been deprecated since Go 1.20 and an alternative has been available since Go 1.0: As of Go 1.20 there is no reason to call Seed with a random value. Programs that call Seed with a known value to get a specific sequence of results should use New(NewSource(seed)) to obtain a local random generator. (staticcheck)
	rand.Seed(time.Now().UnixNano())
	^
internal/proxymanager/proxymanager.go:23:2: SA1019: rand.Seed has been deprecated since Go 1.20 and an alternative has been available since Go 1.0: As of Go 1.20 there is no reason to call Seed with a random value. Programs that call Seed with a known value to get a specific sequence of results should use New(NewSource(seed)) to obtain a local random generator. (staticcheck)
	rand.Seed(time.Now().UnixNano())
	^
INFO File cache stats: 2 entries of total size 1.3KiB 
INFO Memory: 5 samples, avg is 34.4MB, max is 47.3MB 
INFO Execution took 348.222792ms                  
make: *** [golangci-lint] Error 1

fixing error above will break compatibility with <1.20

Testing below for 301 and following redirects via curl with -L flag:
curl -k -v -L -x "http://127.0.0.1:9000" https://httpbin.org/status/301 and curl -k -v -x "http://127.0.0.1:9000" https://httpbin.org/status/301


           _   
 _____ _ _| |_ ___ ___ ___
|     | | | . | -_|   | . |
|_|_|_|___|___|___|_|_|_  |
                      |___|
 infosec@kitabisa.com

2024/09/03 12:41:20 [INFO] ▶ [PID: 76761] Starting proxy server on 127.0.0.1:9000
2024/09/03 12:41:24 [DEBU] ▶ 127.0.0.1:50789 GET https://httpbin.org:443/status/301
2024/09/03 12:41:26 [DEBU] ▶ 127.0.0.1:50789 301 MOVED PERMANENTLY
2024/09/03 12:41:45 [DEBU] ▶ 127.0.0.1:50792 GET https://httpbin.org:443/status/301
2024/09/03 12:41:47 [DEBU] ▶ 127.0.0.1:50792 301 MOVED PERMANENTLY
2024/09/03 12:41:47 [DEBU] ▶ 127.0.0.1:50794 GET https://httpbin.org:443/redirect/1
2024/09/03 12:41:47 [DEBU] ▶ 127.0.0.1:50794 302 FOUND
2024/09/03 12:41:47 [DEBU] ▶ 127.0.0.1:50796 GET https://httpbin.org:443/get
2024/09/03 12:41:48 [DEBU] ▶ 127.0.0.1:50796 200 OK
^C2024/09/03 12:41:51 [WARN] ▶ Interuppted. Exiting...

Closing issues

Fixes #247 , #230

Checklist:

  • My code follows the code style of this project.
  • My change requires a change to the documentation.
    • I have updated the documentation accordingly.
  • I have followed the guidelines in our CONTRIBUTING.md document.
  • I have written new tests for my changes.
  • My changes successfully ran and pass tests locally.

@dwisiswant0
Copy link
Collaborator

dwisiswant0 commented Sep 3, 2024

I'm not sure if this is an issue that's OS-specific or what, but everything was just fine in my environment prior to this PR.

image

$ git log --oneline | head -1
66705fe feat: defaulting max retries to `0`
$ cat ~/.curlrc
--silent
--insecure
--location
--compressed
--path-as-is
--user-agent "X"

@yudelevi
Copy link
Contributor Author

yudelevi commented Sep 3, 2024

That's precisely the issue, you got 200 when you should have gotten a 301. So redirects were followed by default (which is still the existing case, no change there) and that was hidden from the clients using it.

In my use-case, I'm following the redirects and extracting the redirect path in the client connecting to mubeng.

Moreover, before this PR, for redirects paths with >10 redirects it would return an error rather than the last redirect (see #230)

@dwisiswant0
Copy link
Collaborator

Here is the output of your branch:
image

That's precisely the issue, you got 200 when you should have gotten a 301.

Wdym? http.Client already follows up to 10 redirects by default. Here is the verbose output (of HEAD branch):

image

Logs:

           _   v0.15.3-6-g66705fe
 _____ _ _| |_ ___ ___ ___
|     | | | . | -_|   | . |
|_|_|_|___|___|___|_|_|_  |
                      |___|
 infosec@kitabisa.com

2024/09/04 04:40:50 [INFO] ▶ [PID: 2068829] Starting proxy server on :8989
2024/09/04 04:40:52 [DEBU] ▶ [::1]:40590 GET https://httpbin.org:443/status/301
* Request to https://httpbin.org:443/status/301
* Request from [::1]:40590
* Skipping TLS verification: connection is susceptible to man-in-the-middle attacks.
> GET /status/301 HTTP/1.1
> Host: httpbin.org
> Accept: */*
> Accept-Encoding: deflate, gzip, br, zstd
> User-Agent: X
> X-Forwarded-For: **************
> X-Forwarded-Proto: https

< HTTP/1.1 301 MOVED PERMANENTLY
< Access-Control-Allow-Credentials: true
< Access-Control-Allow-Origin: *
< Content-Length: 0
< Date: Tue, 03 Sep 2024 21:40:53 GMT
< Location: /redirect/1
< Server: gunicorn/19.9.0

* Request to https://httpbin.org:443/redirect/1
* Skipping TLS verification: connection is susceptible to man-in-the-middle attacks.
> GET /redirect/1
> Host: httpbin.org
> Accept: */*
> Accept-Encoding: deflate, gzip, br, zstd
> Referer: https://httpbin.org:443/status/301
> User-Agent: X
> X-Forwarded-For: **************
> X-Forwarded-Proto: https

< HTTP/1.1 302 FOUND
< Access-Control-Allow-Credentials: true
< Access-Control-Allow-Origin: *
< Content-Length: 215
< Content-Type: text/html; charset=utf-8
< Date: Tue, 03 Sep 2024 21:40:55 GMT
< Location: /get
< Server: gunicorn/19.9.0

* Request to https://httpbin.org:443/get
* Skipping TLS verification: connection is susceptible to man-in-the-middle attacks.
> GET /get
> Host: httpbin.org
> Accept: */*
> Accept-Encoding: deflate, gzip, br, zstd
> Referer: https://httpbin.org:443/redirect/1
> User-Agent: X
> X-Forwarded-For: **************
> X-Forwarded-Proto: https

< HTTP/1.1 200 OK
< Access-Control-Allow-Credentials: true
< Access-Control-Allow-Origin: *
< Content-Length: 367
< Content-Type: application/json
< Date: Tue, 03 Sep 2024 21:40:57 GMT
< Server: gunicorn/19.9.0

2024/09/04 04:40:57 [DEBU] ▶ [::1]:40590 200 OK
^C2024/09/04 04:41:45 [WARN] ▶ Interuppted. Exiting...

@dwisiswant0
Copy link
Collaborator

Moreover, before this PR, for redirects paths with >10 redirects it would return an error rather than the last redirect (see #230)

Unless you provide an instance of the host like that to reproduce the issue, it would be better.

The URL of the issue you're referring to is just fine by me.
image

@yudelevi
Copy link
Contributor Author

yudelevi commented Sep 3, 2024

Yes, by default there is no change in behavior (otherwise it's a breaking change). But if you set max-redirects to 0, you let the client app control the redirect.

Default settings:

>>> resp=requests.get('https://httpbin.org/status/301',proxies={'http':'127.0.0.1:9000', 'https':'127.0.0.1:9000'},verify=False)
>>> resp.status_code
200
>>> resp.history
[]

by setting max-redirects to 0

>>> resp=requests.get('https://httpbin.org/status/301',proxies={'http':'127.0.0.1:9000', 'https':'127.0.0.1:9000'},verify=False)
>>> resp.status_code
200
>>> resp.history
[<Response [301]>, <Response [302]>]

Both cases made exactly 3 calls (301, 302, 200).

Another example, you might not want to follow the redirects at all in the client :
default settings:

>>> resp=requests.get('https://httpbin.org/status/301',proxies={'http':'127.0.0.1:9000', 'https':'127.0.0.1:9000'},verify=False, allow_redirects=False)
>>> resp.status_code
200

max-redirects=0

>>> resp=requests.get('https://httpbin.org/status/301',proxies={'http':'127.0.0.1:9000', 'https':'127.0.0.1:9000'},verify=False, allow_redirects=False)
>>> resp.status_code
301

@dwisiswant0
Copy link
Collaborator

dwisiswant0 commented Sep 3, 2024

I get it. But I need a URL that has >10 redirects in order to reproduce the issue and see what behavior I get by default.

@yudelevi
Copy link
Contributor Author

yudelevi commented Sep 3, 2024

for http://www.soulmatereading.com (note, it's http and not https):

in my PR

dyudelevich@MadMacs ~ % curl -k -v -x "http://127.0.0.1:9000" http://www.soulmatereading.com 
*   Trying 127.0.0.1:9000...
* Connected to 127.0.0.1 (127.0.0.1) port 9000
> GET http://www.soulmatereading.com/ HTTP/1.1
> Host: www.soulmatereading.com
> User-Agent: curl/8.4.0
> Accept: */*
> Proxy-Connection: Keep-Alive
> 
< HTTP/1.1 301 Moved Permanently
< Cache-Control: max-age=0
< Content-Length: 240
< Content-Type: text/html; charset=iso-8859-1
< Date: Tue, 03 Sep 2024 22:13:50 GMT
< Expires: Tue, 03 Sep 2024 22:13:50 GMT
< Location: https://www.soulmatereading.com/
< Server: Apache
< 
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>301 Moved Permanently</title>
</head><body>
<h1>Moved Permanently</h1>
<p>The document has moved <a href="http://clevelandohioweatherforecast.com//pFad.php?u=https%3A%2F%2Fwww.soulmatereading.com%2F">here</a>.</p>
</body></html>
* Connection #0 to host 127.0.0.1 left intact

Head

*   Trying 127.0.0.1:9000...
* Connected to 127.0.0.1 (127.0.0.1) port 9000
> GET http://www.soulmatereading.com/ HTTP/1.1
> Host: www.soulmatereading.com
> User-Agent: curl/8.4.0
> Accept: */*
> Proxy-Connection: Keep-Alive
> 
< HTTP/1.1 502 Bad Gateway
< Content-Type: text/plain
< Date: Tue, 03 Sep 2024 22:15:35 GMT
< Content-Length: 18
< 
* Connection #0 to host 127.0.0.1 left intact
Proxy server error%  

same happens with http://www.bunch.ca/about (http not https)

@dwisiswant0
Copy link
Collaborator

note, it's http and not https

I didn't notice that, lol.

Now it makes sense to me.

common/options.go Show resolved Hide resolved
common/vars.go Outdated Show resolved Hide resolved
internal/runner/options.go Outdated Show resolved Hide resolved
pkg/mubeng/mubeng.go Outdated Show resolved Hide resolved
@dwisiswant0
Copy link
Collaborator

dwisiswant0 commented Sep 3, 2024

Unresolved questions:

@yudelevi yudelevi requested a review from dwisiswant0 September 3, 2024 23:42
Copy link
Collaborator

@dwisiswant0 dwisiswant0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lesgo! Thank you, @yudelevi.

@dwisiswant0 dwisiswant0 changed the title feat: add --max-redirects option and return last result in case of redirect feat: add --max-redirects option Sep 4, 2024
@dwisiswant0 dwisiswant0 merged commit c65b80d into mubeng:master Sep 4, 2024
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants








ApplySandwichStrip

pFad - (p)hone/(F)rame/(a)nonymizer/(d)eclutterfier!      Saves Data!


--- a PPN by Garber Painting Akron. With Image Size Reduction included!

Fetched URL: https://github.com/mubeng/mubeng/pull/248

Alternative Proxies:

Alternative Proxy

pFad Proxy

pFad v3 Proxy

pFad v4 Proxy