Accessing a given website is trivial and relies on egress policies and a direct connection. It is usable from code repositories, webhooks, functions, etc.
For example: doing a web search via a search engine is entirely doable, as only one domain is accessed (google.com, etc.).
Accessing arbitrary websites is more complex, as blanket access to the entire internet is usually not possible due to strict security and audit requirements. Egress policies must be created for each specific domain or IP address and port you wish to access; wildcards or “allow all” policies are not supported. This applies whether you are using direct connection, agent proxy, or enabling external system access in a code repository.
For example: doing a web search via a search engine and then navigating to each website in the results essentially requires blanket access to the internet, which appears difficult given the strict security requirements above.
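To make the constraint concrete, here is a minimal Python sketch of how an exact-match allowlist behaves. It only illustrates the idea and is not how Foundry implements egress policies; `ALLOWED_EGRESS` and `is_egress_allowed` are hypothetical names, and `www.google.com` stands in for whichever single search domain has a policy.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist: one entry per exact (hostname, port), no wildcards.
ALLOWED_EGRESS = {("www.google.com", 443)}

# Ports implied by the scheme when the URL does not state one.
DEFAULT_PORTS = {"http": 80, "https": 443}


def is_egress_allowed(url: str) -> bool:
    """Return True only if the URL's exact host and port are allowlisted."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return (parts.hostname, port) in ALLOWED_EGRESS
```

Under these assumptions, the search request itself passes, but every result link pointing to another domain fails the check, which is why "search, then browse the results" effectively needs blanket access.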
That being said, the sections below discuss the different options and where each might become blocking.
See this post for terminology around data connection:
https://community.palantir.com/t/what-is-better-between-agents-and-direct-connection-how-can-i-scale-vertically-horizontally-etc/1297
Agent
By default, an agent can access whatever is reachable on its network; it is only restricted if such a configuration is made. See https://www.palantir.com/docs/foundry/data-connection/set-up-agent/#secure-an-agent-host. This makes agents good candidates for reaching a wide range of websites.
However, agents are not usable from code repositories and do not support webhooks.
Agent proxy
Agent proxies are close to agents, but they differ in the following way:
The hostname and port in the URL defined on the source restrict access to only that hostname and port when connecting to the agent proxy. Attempts to connect to any other hostname or port will result in an HTTP 403 (Forbidden) response code from the proxy.
This makes them difficult to use for arbitrary website access.
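The restriction described above can be mimicked with a small Python sketch. This is a simulation under stated assumptions, not the agent proxy's actual code: `simulated_proxy_status` and `_host_port` are hypothetical helpers, and the `200` return value is only a placeholder for "the request would be forwarded".

```python
from urllib.parse import urlsplit


def _host_port(url: str) -> tuple:
    """Extract (hostname, port), falling back to the scheme's default port."""
    parts = urlsplit(url)
    default = {"http": 80, "https": 443}.get(parts.scheme)
    return (parts.hostname, parts.port or default)


def simulated_proxy_status(source_url: str, request_url: str) -> int:
    """Return 403 unless the request targets the source's exact host and port."""
    if _host_port(request_url) != _host_port(source_url):
        return 403
    return 200  # placeholder: the request would be forwarded to the target
```

With a source defined on `https://api.example.com`, a request to any path on that host and port is forwarded, while a request to another hostname or another port is rejected.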
Direct connection
As stated above, direct connection, like agent proxy, requires explicit egress policies to be set up for each website/IP you want to access.
Third-party
One way to achieve this use case is to use a third party to perform the web search (Perplexity, OpenAI with web search endpoints, etc.), but this should be weighed against the consequences of sharing information (the query) with those third parties.
The implementation relies on standard mechanisms (egress to the third party, usage in a code repository or webhooks, etc.).
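As a rough illustration of the third-party approach, the Python sketch below builds (without sending) a search request to a single external endpoint. The endpoint URL, the JSON body shape, and `build_search_request` are all assumptions made for the example; they do not correspond to any specific provider's API, so check the provider's documentation for the real request format.

```python
import json
import urllib.request

# Hypothetical third-party endpoint: the only domain that needs an egress policy.
SEARCH_ENDPOINT = "https://search.example.com/v1/search"


def build_search_request(query: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying the search query."""
    body = json.dumps({"query": query}).encode("utf-8")  # assumed body shape
    return urllib.request.Request(
        SEARCH_ENDPOINT,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Sending it with `urllib.request.urlopen(request)` would only touch the provider's domain, so a single egress policy suffices. Note that the request body is exactly the information that leaves the platform, which is the trade-off mentioned above.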