Hello, this is Hamamoto from TIMEWELL.
"We want to roll out Claude Code company-wide. But neither code nor prompts should ever leave our perimeter." I have heard this request more than 20 times in the last six months. The first question I always ask back is the same: "Are you primarily on AWS or Google Cloud?"
Why? Because Claude Code can run not only via a direct Anthropic contract but also officially through Amazon Bedrock or Google Vertex AI. If you choose a route that lets you reuse existing IAM, KMS, PrivateLink, and CloudTrail, you can move into production with almost no new security review. Pick the wrong route and your existing controls stop working, and you may collide with data sovereignty requirements. This article walks through that decision in depth — implementation settings for Bedrock and Vertex AI, when to layer in a gateway (LiteLLM, Helicone, Kong AI Gateway), and how to satisfy Tokyo-region data residency requirements. Three code examples are included at copy-paste granularity.
This is the sixth installment of TIMEWELL's "Claude Code for Enterprise" series. After onboarding, SOC 2 / ISO 27001, and cost optimization, this entry goes deep into the infrastructure layer.
## Why route through Bedrock or Vertex AI: data sovereignty, audit, and existing controls
Start with the obvious counter-question. Anthropic's direct contract is SOC 2 Type II certified and supports Zero Data Retention (ZDR). So why do enterprises still pick Bedrock or Vertex AI? It boils down to three reasons: data sovereignty, audit and governance, and reuse of existing contracts.
Data sovereignty first. Amazon Bedrock's geo-restricted Cross-Region Inference (CRIS) profiles let you confine requests and responses to a specific geography. For Japan, profiles with the "jp." prefix (for example, jp.anthropic.claude-sonnet-4-5-20250929-v1:0) have been available since October 2025, keeping inference inside Tokyo (ap-northeast-1) and Osaka (ap-northeast-3). Because traffic flows over the AWS Global Network, it does not transit the public internet. For organizations watching FISC security standards or APPI (Japan's Act on the Protection of Personal Information) cross-border restrictions, this design is a powerful card to play.
Next, audit and governance. Bedrock automatically records InvokeModel, InvokeModelWithResponseStream, Converse, and ConverseStream as CloudTrail management events. Vertex AI's audit logs flow into Cloud Logging under aiplatform.googleapis.com. In other words, you can answer "who called which model and when" simply by pointing your existing SIEM (Splunk, Datadog, Sumo Logic, etc.) at the existing log destinations. No new audit pipeline required — that is the practical win. Anthropic itself recommends retaining Bedrock-routed logs for at least 30 days.
The third is reuse of existing contracts and controls. If you already have an Enterprise Discount Program with AWS, Claude Code usage rolls into your AWS invoice. You skip new vendor onboarding, credit checks, contract review, legal, and InfoSec. This is the largest practical accelerator. "Direct contracts with Anthropic take six months. With Bedrock we can use it next week" is the canonical scenario in today's enterprise environment.
Conversely, the direct contract is preferable when you want to try the latest models on launch day, or when you are an on-prem-centric organization not co-located with AWS or GCP. The decision criterion is simple: where do your existing controls live? Anchor Claude Code to the same place — that is the lowest-friction path.
## Implementing Claude Code via AWS Bedrock: IAM, PrivateLink, and CloudTrail
Switching Claude Code to Bedrock takes two environment variables: CLAUDE_CODE_USE_BEDROCK=1 and AWS_REGION. Claude Code does not read .aws/config, so the region must be specified explicitly. Anthropic also released an interactive Bedrock setup wizard in v2.1.92 (April 2026), but for CI/CD pipelines and company-wide rollouts, locking everything down with environment variables is operationally simpler.
Here is a real-world example. In enterprise contexts, the standard is to combine IAM roles with SSO (AWS IAM Identity Center) — never hand out static access keys per developer.
```shell
# Lock down via ~/.claude/settings.json or environment variables
export CLAUDE_CODE_USE_BEDROCK=1
export AWS_REGION=ap-northeast-1
export AWS_PROFILE=claude-code-tokyo
export ANTHROPIC_DEFAULT_SONNET_MODEL=jp.anthropic.claude-sonnet-4-5-20250929-v1:0
export ANTHROPIC_DEFAULT_HAIKU_MODEL=jp.anthropic.claude-haiku-4-5-20251001-v1:0

# Make the AWS SDK read ~/.aws/config (profile and SSO settings)
export AWS_SDK_LOAD_CONFIG=1

# To auto-refresh expired SSO sessions on the Claude Code side:
# set "awsAuthRefresh": "aws sso login --profile claude-code-tokyo" in settings.json
```
Two things matter. Explicitly specify the jp.-prefixed inference profile, and configure awsAuthRefresh so Claude Code re-runs SSO login automatically. Skip the latter and developers will start their day filing "401 errors, can't work" tickets.
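Put together as a `settings.json` fragment, the same lockdown looks like this (the profile name `claude-code-tokyo` is this article's running example; swap in your own SSO profile):

```json
{
  "env": {
    "CLAUDE_CODE_USE_BEDROCK": "1",
    "AWS_REGION": "ap-northeast-1",
    "AWS_PROFILE": "claude-code-tokyo",
    "ANTHROPIC_DEFAULT_SONNET_MODEL": "jp.anthropic.claude-sonnet-4-5-20250929-v1:0"
  },
  "awsAuthRefresh": "aws sso login --profile claude-code-tokyo"
}
```

Distributing this file through your dotfile or MDM tooling means developers never set the variables by hand, which is what makes the "401 ticket" failure mode go away.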
Next, the IAM policy — least privilege. Bedrock needs three inference actions plus a couple of list actions, which translates roughly to the following Terraform:
```hcl
resource "aws_iam_policy" "claude_code_bedrock" {
  name        = "ClaudeCodeBedrockInvoke"
  description = "Least privilege for invoking Bedrock Anthropic Claude via Claude Code"

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Sid    = "InvokeAnthropicClaude"
        Effect = "Allow"
        Action = [
          "bedrock:InvokeModel",
          "bedrock:InvokeModelWithResponseStream",
          "bedrock:ListFoundationModels",
          "bedrock:ListInferenceProfiles"
        ]
        Resource = [
          "arn:aws:bedrock:ap-northeast-1::foundation-model/anthropic.claude-sonnet-4-5-20250929-v1:0",
          "arn:aws:bedrock:ap-northeast-1:*:inference-profile/jp.anthropic.claude-sonnet-4-5-20250929-v1:0",
          "arn:aws:bedrock:ap-northeast-3::foundation-model/anthropic.claude-sonnet-4-5-20250929-v1:0"
        ]
      },
      {
        Sid      = "MarketplaceForBedrockOnly"
        Effect   = "Allow"
        Action   = "aws-marketplace:Subscribe"
        Resource = "*"
        Condition = {
          StringEquals = {
            # Replace with the actual Marketplace product ID of the Claude model you subscribe to
            "aws-marketplace:ProductId" = "anthropic-claude-bedrock"
          }
        }
      }
    ]
  })
}
```
Listing the inference profile ARNs explicitly in Resource matters especially for organizations that constrain regions via SCPs (Service Control Policies). Claude Code internally translates foundation model IDs into cross-region inference profiles, and SCPs that block us-east-1 have been reported (GitHub Issue #20594) to cause 403 failures. To stay confined to the Japanese regions, the safe move is to explicitly allow the specific inference profile ARNs.
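As one illustration, a region-confinement SCP can take roughly this shape (a sketch, not a drop-in policy; note that both ap-northeast-1 and ap-northeast-3 must be allowed, or the jp. cross-region profile itself will start failing):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DenyBedrockOutsideJapan",
      "Effect": "Deny",
      "Action": [
        "bedrock:InvokeModel",
        "bedrock:InvokeModelWithResponseStream"
      ],
      "Resource": "*",
      "Condition": {
        "StringNotEquals": {
          "aws:RequestedRegion": ["ap-northeast-1", "ap-northeast-3"]
        }
      }
    }
  ]
}
```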
Lock down the network path with PrivateLink (VPC interface endpoints). Create both com.amazonaws.ap-northeast-1.bedrock-runtime and com.amazonaws.ap-northeast-1.bedrock. Claude Code requests then stay inside the VPC, never touching the NAT gateway. Even in environments where the internal dev network can only reach the internet through a proxy, PrivateLink reaches Bedrock directly. At roughly $0.014/hour per endpoint, the cost is negligible for the security posture you get in return.
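A Terraform sketch of those two interface endpoints follows; `var.vpc_id`, `var.private_subnet_ids`, and the `bedrock_endpoint` security group are assumed to exist elsewhere in your configuration:

```hcl
# Interface endpoints for the Bedrock control plane and runtime.
# Assumes var.vpc_id, var.private_subnet_ids, and an existing security group.
locals {
  bedrock_services = [
    "com.amazonaws.ap-northeast-1.bedrock",
    "com.amazonaws.ap-northeast-1.bedrock-runtime",
  ]
}

resource "aws_vpc_endpoint" "bedrock" {
  for_each            = toset(local.bedrock_services)
  vpc_id              = var.vpc_id
  service_name        = each.value
  vpc_endpoint_type   = "Interface"
  subnet_ids          = var.private_subnet_ids
  security_group_ids  = [aws_security_group.bedrock_endpoint.id]
  private_dns_enabled = true
}
```

With `private_dns_enabled = true`, the standard bedrock-runtime hostname resolves to the endpoint's private IPs, so Claude Code needs no configuration change at all.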
For audit logs, the canonical pattern is CloudTrail plus CloudWatch. CloudTrail captures the API call metadata (who, when, which IAM role, which model). CloudWatch Logs captures the body of each model invocation (prompt and output, token counts). To enable CloudWatch Logs, set the destination log group via "Settings → Model invocation logging" in the Bedrock console. Anthropic recommends a minimum 30-day rolling retention, but for SOC 2 or ISO 27001 audits, 90 days to one year is safer. Use a CloudWatch Logs group encrypted with a KMS Customer-Managed Key (CMK) and you keep key management on your side as well.
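Model invocation logging can also be enabled without the console, via the AWS CLI; the log group name and role ARN below are placeholders for your own values:

```shell
# Enable Bedrock model invocation logging to a CMK-encrypted log group.
# Log group name and IAM role ARN are placeholders.
aws bedrock put-model-invocation-logging-configuration \
  --region ap-northeast-1 \
  --logging-config '{
    "cloudWatchConfig": {
      "logGroupName": "/bedrock/claude-code/invocations",
      "roleArn": "arn:aws:iam::123456789012:role/BedrockLoggingRole"
    },
    "textDataDeliveryEnabled": true
  }'
```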
## Implementing Claude Code via Google Vertex AI: Service Account, VPC Service Controls, and Cloud Logging
Vertex AI is conceptually similar to Bedrock but the authorization model differs significantly. AWS centers on IAM roles; Google Cloud combines Service Accounts with org-level boundaries (VPC Service Controls). Claude Code's Vertex AI integration also gained an interactive wizard in v2.1.98, but for enterprise use, non-interactive setup via gcloud CLI and ADC (Application Default Credentials) is the common path.
The minimum environment variables are three: CLAUDE_CODE_USE_VERTEX=1, CLOUD_ML_REGION (region or global), and ANTHROPIC_VERTEX_PROJECT_ID. Configuration looks like this:
```json
{
  "env": {
    "CLAUDE_CODE_USE_VERTEX": "1",
    "CLOUD_ML_REGION": "asia-northeast1",
    "ANTHROPIC_VERTEX_PROJECT_ID": "timewell-claude-code-prod",
    "GOOGLE_APPLICATION_CREDENTIALS": "/etc/secrets/claude-code-sa.json",
    "ANTHROPIC_DEFAULT_SONNET_MODEL": "claude-sonnet-4-5@20250929",
    "ANTHROPIC_DEFAULT_HAIKU_MODEL": "claude-haiku-4-5@20251001"
  }
}
```
The role granted to the Service Account is, in principle, just roles/aiplatform.user. This role includes aiplatform.endpoints.predict (model invocation) and aiplatform.endpoints.computeTokens (token counting), which is everything Claude Code needs. Granting roles/aiplatform.admin is overkill and should be avoided for audit hygiene as well.
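The grant itself is a single binding; the Service Account name below is this article's example and should be replaced with your own:

```shell
# Grant the minimal Vertex AI invocation role to the Claude Code Service Account.
# The SA name is an example; substitute your own.
gcloud projects add-iam-policy-binding timewell-claude-code-prod \
  --member="serviceAccount:claude-code-sa@timewell-claude-code-prod.iam.gserviceaccount.com" \
  --role="roles/aiplatform.user"
```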
The serious hardening step on the Vertex AI side is to place the Claude Code project inside a VPC Service Controls (VPC SC) perimeter. Once the perimeter is set, requests to aiplatform.googleapis.com only succeed from inside the boundary, physically blocking calls from rogue projects or personal accounts. One caveat: Claude Sonnet 4's Web Search feature does not work inside a VPC SC perimeter (because the perimeter blocks egress to the public internet). For pure code generation and repository operations there is no real impact, but agent use cases that include document search require a separate project outside the perimeter.
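A minimal perimeter creation looks roughly like this; the access policy ID and project number are placeholders, and real deployments usually manage perimeters via Terraform rather than ad-hoc gcloud calls:

```shell
# Sketch: create a VPC SC perimeter restricting Vertex AI to in-perimeter projects.
# Policy ID and project number are placeholders.
gcloud access-context-manager perimeters create claude_code_perimeter \
  --policy=0123456789 \
  --title="claude-code-perimeter" \
  --resources=projects/123456789012 \
  --restricted-services=aiplatform.googleapis.com
```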
For audit logs, enable Cloud Audit Logs' "Data Access Logs" for aiplatform.googleapis.com. By default only "Admin Activity Logs" are captured, so enabling Data Access Logs is mandatory for enterprise use. Vertex AI also offers an optional "Request-Response Logging" feature that retains prompts and outputs for up to 30 days when enabled. This is the equivalent of Bedrock's CloudWatch Logs and is useful for abuse detection and quality auditing.
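Enabling Data Access Logs is a one-time audit config change, sketched here in Terraform (the project ID is this article's example):

```hcl
# Turn on Data Access audit logs for Vertex AI in the Claude Code project.
resource "google_project_iam_audit_config" "vertex_data_access" {
  project = "timewell-claude-code-prod"
  service = "aiplatform.googleapis.com"

  audit_log_config {
    log_type = "DATA_READ"
  }
  audit_log_config {
    log_type = "DATA_WRITE"
  }
}
```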
One thing to note: Vertex AI's audit logs are coarse. You can trace "which Service Account hit which endpoint when," but not "input token count per request" or "cost allocation per user." If you need strict per-developer usage management, the realistic answer is to insert a gateway layer like LiteLLM or Helicone in front, as described below.
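Whichever gateway you pick, the per-user allocation itself is simple arithmetic once usage records exist. A minimal sketch, assuming a hypothetical record shape and placeholder per-1K-token prices (substitute your provider's actual rates and your gateway's export format):

```python
from collections import defaultdict

# Hypothetical per-1K-token prices in USD; replace with your provider's actual rates.
PRICE_PER_1K = {"input": 0.003, "output": 0.015}

def cost_per_user(records):
    """Aggregate LLM spend per user from gateway-style usage records.

    Each record is assumed to look like:
      {"user": "dev-a", "input_tokens": 1200, "output_tokens": 400}
    """
    totals = defaultdict(float)
    for r in records:
        cost = (r["input_tokens"] / 1000) * PRICE_PER_1K["input"] \
             + (r["output_tokens"] / 1000) * PRICE_PER_1K["output"]
        totals[r["user"]] += cost
    return dict(totals)

records = [
    {"user": "dev-a", "input_tokens": 1000, "output_tokens": 1000},
    {"user": "dev-b", "input_tokens": 2000, "output_tokens": 0},
]
print(cost_per_user(records))
```

Gateways like LiteLLM do exactly this bookkeeping for you, keyed by Virtual API Key, which is why they become attractive once per-department budgets appear.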
For the network path, use Private Service Connect for Vertex AI. With PSC configured, the asia-northeast1-aiplatform.googleapis.com endpoint resolves to a private IP inside your VPC, and traffic stays on GCP's backbone. The concept is identical to AWS PrivateLink. If you only allow PSC endpoint access from your corporate VPN, you also prevent the accident of a developer hitting Vertex AI directly from their home Wi-Fi.
## When and how to insert a gateway layer: LiteLLM, Helicone, Kong AI Gateway
For some organizations, connecting directly to Bedrock or Vertex AI is enough. Others want a gateway in between. Here is the decision framework.
There are three practical reasons to add a gateway. First, unified management of multiple models and providers (one API for OpenAI, Bedrock-routed Claude, Vertex AI-routed Claude, Gemini). Second, per-user cost allocation and rate limiting (departmental budgets, per-developer caps). Third, guardrails like PII detection and prompt injection protection (input inspection for GDPR / APPI compliance). When existing security mechanisms cannot cover these, a gateway earns its place.
Three options dominate the conversation now: LiteLLM, Helicone, and Kong AI Gateway.
LiteLLM is the first choice for enterprises that want to self-host. It is a Python-based self-hosted proxy that fronts 140+ providers behind an OpenAI-compatible API and ships with Virtual API Keys, Budget, SSO (Okta, Azure AD), RBAC, and audit logs. With around 40K stars on GitHub, the community is among the largest. Performance-wise, reports cite instability above ~2,000 requests per second, so for serious production runs assume horizontal scaling on Kubernetes. The cleanest pattern is to put both Bedrock and Vertex AI behind it and expose only a simple OpenAI-compatible endpoint to developers.
```yaml
# Excerpt from litellm config.yaml — fronting both Bedrock and Vertex AI
model_list:
  - model_name: claude-sonnet-jp-bedrock
    litellm_params:
      model: bedrock/jp.anthropic.claude-sonnet-4-5-20250929-v1:0
      aws_region_name: ap-northeast-1
      aws_role_name: arn:aws:iam::123456789012:role/litellm-bedrock
  - model_name: claude-sonnet-asia-vertex
    litellm_params:
      model: vertex_ai/claude-sonnet-4-5@20250929
      vertex_project: timewell-claude-code-prod
      vertex_location: asia-northeast1

router_settings:
  routing_strategy: usage-based-routing-v2
  fallbacks:
    - claude-sonnet-jp-bedrock: ["claude-sonnet-asia-vertex"]

general_settings:
  master_key: os.environ/LITELLM_MASTER_KEY
  database_url: os.environ/DATABASE_URL
  enable_jwt_auth: true
```
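On the developer side, pointing Claude Code at the proxy is then a matter of two environment variables; the URL below is an assumed internal hostname, and the token is a LiteLLM Virtual Key issued per developer or team:

```shell
# Route Claude Code through the LiteLLM proxy instead of hitting Bedrock directly.
# The hostname is an assumed internal endpoint; the token is a LiteLLM Virtual Key.
export ANTHROPIC_BASE_URL="https://litellm.internal.example.com"
export ANTHROPIC_AUTH_TOKEN="sk-litellm-virtual-key"
export ANTHROPIC_MODEL="claude-sonnet-jp-bedrock"
```

The per-key budgets and RBAC then apply automatically, without developers ever holding AWS or GCP credentials themselves.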
Helicone is well-suited to organizations that prioritize observability. Written in Rust, it adds about 50ms of latency, can be self-hosted, and Anthropic's own Claude Code integration documentation references it. Its monitoring dashboards are more polished than LiteLLM's, and per-request cost, latency, and error rate are visible immediately. It fits the "we want to see what's happening, not control it yet" phase.
Kong AI Gateway is the option for organizations already running Kong as their API management layer. The Bedrock-targeted ai-proxy plugin drops onto an existing Kong deployment, applying authentication, rate limiting, and WAF in one shot. That said, introducing Kong purely for LLM workloads is overkill — we do not actively recommend it outside the existing Kong base.
Honestly, for startups under 50 employees, no gateway is needed. Connect directly to Bedrock or Vertex AI; CloudTrail and Cloud Audit are enough. Gateways start earning their keep at the "mid-size and larger enterprise" tier where you have multiple business units, multiple model providers, and strict PII inspection requirements. Start simple and add a gateway when the pain shows up. That sequencing causes the fewest accidents in our experience.
## Data sovereignty and Tokyo-region data residency: APPI and FISC perspectives
This is the single most asked question. "We need data to stay inside Japan. Can Claude Code actually deliver that?"
The short answer: with AWS Bedrock you can build a configuration confined to Tokyo and Osaka. With Vertex AI you can build a configuration confined to asia-northeast1 (Tokyo). The Anthropic direct contract cannot guarantee this today.
Bedrock's "jp."-prefixed inference profiles (jp.anthropic.claude-sonnet-4-5-20250929-v1:0) keep requests and responses inside Tokyo (ap-northeast-1) and Osaka (ap-northeast-3). The traffic between those two regions also travels over the AWS Global Network, never the public internet. For financial institutions watching FISC's "encryption of important data at rest" and "restrictions on cross-border data transfer," this is a decisive advantage. Anthropic also explicitly states that prompts and outputs sent via Bedrock are not used for Anthropic's model training.
On Vertex AI, asia-northeast1 lets you confine to a single region. Set CLOUD_ML_REGION to asia-northeast1 and place the project inside a VPC SC perimeter, and Claude Sonnet 4.5 requests never leave Tokyo. Because the perimeter is managed at the GCP organization level, even if a developer accidentally specifies a different region, traffic still cannot leave the boundary — the perimeter functions as a safety net.
For APPI's "cross-border transfer" clause, both AWS and Google Cloud's containment in Japanese regions can be treated as not constituting cross-border transfer in principle. That said, third-party-provision consent and documentation of safety management measures are still required, and that piece is realistically supplemented by gateway layers or domestic data sovereignty platforms like ZEROCK.
Personally, I think it is a waste of time to agonize over "Anthropic direct vs. Bedrock vs. Vertex AI" in this space. If your primary infrastructure is AWS, choose Bedrock; if it is GCP, choose Vertex AI; if both are mixed, anchor to whichever the bulk of your engineering organization sits on. That gives you the answer. From there, the work is the disciplined three-piece set: CloudTrail or Cloud Audit log capture, KMS or Cloud KMS CMK use, and PrivateLink or PSC. A configuration missing those three pieces will not survive an audit, no matter how loudly you say "but it's Bedrock."
Once you actually try to roll Claude Code out across an enterprise, the question shifts from "how fast can it generate code" to "what can we reconstruct from logs when an incident hits." If you have CloudTrail's InvokeModel events, full prompt text in CloudWatch Logs, Cloud Audit Logs' Data Access Logs, and Vertex AI's Request-Response Logging in place — those four — incident response works barring exceptional circumstances.
## Summary: the right answer for production rollout is to lean on infrastructure
We have raced through Bedrock and Vertex AI implementations, gateways, and data sovereignty. To wrap up, the key points:
- AWS-centric → route through Bedrock: CLAUDE_CODE_USE_BEDROCK=1, jp.-prefixed inference profile, PrivateLink, CloudTrail, CMK-encrypted log group, SSO + IAM roles — that is the standard form
- GCP-centric → route through Vertex AI: CLAUDE_CODE_USE_VERTEX=1, asia-northeast1, Service Account (roles/aiplatform.user), VPC Service Controls perimeter, Data Access Logs enabled, PSC — that is the standard form
- Add a gateway from mid-size up: LiteLLM (cost and RBAC focused), Helicone (observability focused), Kong AI Gateway (existing Kong integration) — three viable choices
- Data sovereignty: Between Bedrock's "jp." prefix and Vertex AI's asia-northeast1, the option to confine traffic to Tokyo is now well-supported
- The non-negotiable three-piece set: CloudTrail/Cloud Audit logs, KMS/Cloud KMS CMK, PrivateLink/PSC private path. A configuration missing any of these will be flagged in audit, every time
At TIMEWELL we offer ZEROCK, an enterprise AI platform. It runs on AWS Japan-region servers and delivers GraphRAG-based knowledge control, organization-wide prompt library management, and integration with agentic AI like Claude Code in a single design. For Claude Code rollouts where "building everything in-house is unrealistic" or "we just need to get to production fast," ZEROCK can package the audit, knowledge, and gateway layers for you. In our WARP AI consulting practice, we walk alongside Bedrock and Vertex AI migrations on a monthly cadence, from current-state assessment to operational design.
Claude Code is a useful tool, but at the enterprise level you have to decide "which infrastructure to anchor it to" before you decide on the tool. Once you pick infrastructure, IAM, logging, and encryption all extend naturally from what you already have. Skip this and you will be rebuilding six months later. Design first.
For related reading, see the overall picture of Claude Code enterprise adoption, implementation patterns for parallel agent development, and the knowledge-control perspective.
## References
[^1]: Anthropic, "Claude Code on Amazon Bedrock", https://code.claude.com/docs/en/amazon-bedrock
[^2]: Anthropic, "Claude Code on Google Vertex AI", https://code.claude.com/docs/en/google-vertex-ai
[^3]: AWS, "Introducing Amazon Bedrock cross-Region inference for Claude Sonnet 4.5 and Haiku 4.5 in Japan and Australia", https://aws.amazon.com/blogs/machine-learning/introducing-amazon-bedrock-cross-region-inference-for-claude-sonnet-4-5-and-haiku-4-5-in-japan-and-australia/
[^4]: AWS, "Monitor Amazon Bedrock API calls using CloudTrail", https://docs.aws.amazon.com/bedrock/latest/userguide/logging-using-cloudtrail.html
[^5]: Google Cloud, "Vertex AI audit logging information", https://docs.cloud.google.com/vertex-ai/docs/general/audit-logging
[^6]: AWS Solutions Library, "Guidance for Claude Code with Amazon Bedrock", https://github.com/aws-solutions-library-samples/guidance-for-claude-code-with-amazon-bedrock
[^7]: BerriAI, "LiteLLM: Python SDK, Proxy Server (LLM Gateway)", https://github.com/BerriAI/litellm
[^8]: Helicone, "Claude Code with Helicone", https://docs.helicone.ai/integrations/anthropic/claude-code
![Claude Code Enterprise Secure Architecture | Gateway Design via AWS Bedrock and Google Vertex AI [2026 Latest]](/images/columns/claude-code-bedrock-vertex-secure-architecture/cover.png)