feat: redact sensitive feed data in structured logs #903
Pull request overview
This PR tightens security around request-scoped structured logging by redacting sensitive feed tokens and replacing logged source URLs with sanitized metadata, while consolidating observability/security emission through a shared JSON logger (including rack-timeout).
Changes:
- Introduces `AppLogger`, `LogEvent`, and `LogSanitizer` to centralize structured logging and sanitize sensitive fields.
- Updates `Observability` and `SecurityLogger` to emit through the shared structured logger.
- Redacts `/api/v1/feeds/:token` in request context and routes rack-timeout logs through the same JSON formatter.
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| spec/html2rss/web/request_context_middleware_spec.rb | Adds coverage for redacting feed tokens in request context path. |
| spec/html2rss/web/log_sanitizer_spec.rb | New specs for path redaction, URL sanitization, and log formatting behavior. |
| app/web/telemetry/observability.rb | Switches observability emission to the shared LogEvent emitter. |
| app/web/telemetry/log_sanitizer.rb | Adds sanitizers for feed-token paths and URL fields in log details. |
| app/web/telemetry/log_event.rb | Introduces a shared emitter that merges request context + sanitized payload. |
| app/web/telemetry/app_logger.rb | Adds a shared JSON logger/formatter (JSON + logfmt parsing). |
| app/web/security/security_logger.rb | Routes security events through LogEvent and shared logger state. |
| app/web/request/request_context_middleware.rb | Redacts feed tokens when building request context. |
| app/web/boot/setup.rb | Wires rack-timeout logging to use the shared JSON logger. |
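
The URL sanitization listed for `log_sanitizer.rb` replaces a logged source URL with metadata. A minimal sketch of that idea follows, assuming the `host`, `scheme`, and `hash` fields that the spec helper in this review uses. The `sanitize_url` name, the SHA-256 digest, and the 12-character truncation are assumptions, not the PR's code.

```ruby
require 'digest'
require 'uri'

# Hypothetical sanitizer: keeps host and scheme, and replaces the rest of the
# URL (path, query, credentials) with a short, non-reversible fingerprint.
def sanitize_url(url)
  raw = url.to_s
  fingerprint = Digest::SHA256.hexdigest(raw)[0, 12]
  uri = URI.parse(raw)
  { host: uri.host, scheme: uri.scheme, hash: fingerprint }
rescue URI::InvalidURIError
  # Unparseable input still gets a fingerprint so log entries stay correlatable.
  { host: nil, scheme: nil, hash: fingerprint }
end

p sanitize_url('https://example.com/private/feed?token=secret')
```

The fingerprint lets two log entries for the same source be correlated without exposing the path or query string.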

    def sanitize_path(path)
      return if path.nil?

      path_string = path.to_s
      suffix = feed_suffix(path_string)
      token_path = suffix ? path_string.delete_suffix(suffix) : path_string

      token_path.gsub(FEED_TOKEN_ROUTE, "\\1[REDACTED]#{suffix}")
    end

sanitize_path always strips a .json/.xml/.rss suffix before attempting the feed-token replacement. If the path ends with one of those suffixes but does not match the /api/v1/feeds/:token pattern, the method returns the suffix-stripped path, which will corrupt logged paths (e.g., /api/v1/health.json -> /api/v1/health). Consider matching the full feed-token route (including an optional suffix) and only redacting when that match succeeds, otherwise return the original path_string unchanged.
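
One way to follow this suggestion is sketched below. The `FEED_TOKEN_WITH_SUFFIX` pattern is a hypothetical stand-in, since the PR's `FEED_TOKEN_ROUTE` and `feed_suffix` are not shown in the review. It matches the whole feed-token route, including an optional suffix, and leaves every other path untouched.

```ruby
# Hypothetical pattern: the feed-token route with an optional format suffix.
# Group 1 is the route prefix, group 2 is the suffix (nil when absent).
FEED_TOKEN_WITH_SUFFIX = %r{\A(/api/v1/feeds/)[^/]+?(\.(?:json|xml|rss))?\z}

def sanitize_path(path)
  return if path.nil?

  # `sub` returns the string unchanged when the pattern does not match,
  # so non-feed paths keep their suffix.
  path.to_s.sub(FEED_TOKEN_WITH_SUFFIX) do
    match = Regexp.last_match
    "#{match[1]}[REDACTED]#{match[2]}"
  end
end

puts sanitize_path('/api/v1/feeds/abc123.json')
puts sanitize_path('/api/v1/health.json')
```

With this shape the suffix is only ever removed as part of a successful feed-token match, so a path such as `/api/v1/health.json` is logged as-is.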

    RSpec.describe Html2rss::Web::LogSanitizer do
      let(:io) { StringIO.new }
      let(:logger) { Logger.new(io).tap { |log| log.formatter = Html2rss::Web::AppLogger.send(:method, :format_entry) } }

This spec sets the logger formatter via `Html2rss::Web::AppLogger.send(:method, :format_entry)`, where `format_entry` is a private singleton method in `AppLogger`. `Object#method` resolves private methods (only `public_method` is restricted to public ones), so the lookup succeeds, but the `send` is redundant because `method` is itself public. Ruby has no `Object#private_method`, so that is not an alternative. Prefer calling `method(:format_entry)` directly, or expose a small public helper on `AppLogger` intended for tests.

    - let(:logger) { Logger.new(io).tap { |log| log.formatter = Html2rss::Web::AppLogger.send(:method, :format_entry) } }
    + let(:logger) { Logger.new(io).tap { |log| log.formatter = Html2rss::Web::AppLogger.method(:format_entry) } }

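
How `method` and `public_method` treat a private singleton method can be checked in plain Ruby. The sketch below uses a hypothetical `DemoLogger` module as a stand-in for `AppLogger`, whose source is not shown here.

```ruby
# Hypothetical stand-in for AppLogger with a private singleton formatter.
module DemoLogger
  class << self
    private

    def format_entry(severity, _time, _progname, message)
      "#{severity}: #{message}\n"
    end
  end
end

# Object#method looks up private methods too, so no `send` is needed.
formatter = DemoLogger.method(:format_entry)
puts formatter.call('INFO', nil, nil, 'ok')

# Object#public_method is the variant that refuses private methods.
refusal = begin
  DemoLogger.public_method(:format_entry)
  nil
rescue NameError => e
  e.class
end
puts "public_method raised: #{refusal}"
```

The returned `Method` object responds to `call` with the four arguments `Logger` passes to a formatter, which is why it can be assigned to `log.formatter` directly.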

    # @param url [String]
    # @return [Hash{Symbol=>String}]
    def sanitized_url(host, url)
      { host:, scheme: 'https', hash: url_hash(url) }
    end

The helper def sanitized_url(host, url) defined later in this spec overrides the earlier let(:sanitized_url) helper method. After this definition, any call to sanitized_url without arguments (e.g. eq(url: sanitized_url)) will raise an ArgumentError. Rename one of these helpers (e.g., expected_sanitized_url for the let, or build_sanitized_url for the helper) to avoid the method name collision.
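
The collision can be reproduced without RSpec. The sketch below is a simplified stand-in: it assumes `let(:sanitized_url)` behaves like a zero-argument method defined on the example group, and the class names and return values are hypothetical.

```ruby
# Simplified stand-in for an example group with a name collision.
class CollidingGroup
  # Plays the role of `let(:sanitized_url) { ... }`: a zero-argument method.
  define_method(:sanitized_url) { { host: 'example.com', scheme: 'https' } }

  # A later helper with the same name replaces the definition above.
  def sanitized_url(host, url)
    { host: host, scheme: 'https', url: url }
  end
end

message = begin
  CollidingGroup.new.sanitized_url
rescue ArgumentError => e
  e.message
end
puts message

# Renaming, as the review suggests, lets both helpers coexist.
class RenamedGroup
  define_method(:expected_sanitized_url) { { host: 'example.com', scheme: 'https' } }

  def build_sanitized_url(host, url)
    { host: host, scheme: 'https', url: url }
  end
end

p RenamedGroup.new.expected_sanitized_url
```

Running Ruby with warnings enabled (`ruby -w`) also reports the redefinition, which can help catch this kind of collision early.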
🤖 I have created a release *beep* *boop*

---

## [1.1.0](html2rss-web-v1.0.0...html2rss-web/v1.1.0) (2026-05-01)

### Features

* add help text on error page ([eeee345](eeee345)), closes [#338](#338)
* add routed frontend feed creation workflow ([#963](#963)) ([2d1b71a](2d1b71a))
* **auto_source:** add support for `auto_source` feature ([#676](#676)) ([531dced](531dced))
* default browserless onboarding and request strategies ([#895](#895)) ([377cff0](377cff0))
* **deps:** use html2rss in latest development status ([#728](#728)) ([5885d1d](5885d1d))
* **docker:** switch to alpine 21 ([7adcc89](7adcc89))
* **docker:** upgrade to use ruby 3.3 image ([ceafe24](ceafe24))
* **docker:** use multilayer build to cut image size in half ([2f6e322](2f6e322))
* **docker:** use Ruby 3.4 ([4f7d795](4f7d795))
* **frontend:** polish result experience and validation tooling ([#964](#964)) ([b11665e](b11665e))
* **frontend:** relaunch the app with a focused v1 flow ([e0692d7](e0692d7))
* **frontend:** unify feed/result state flow ([#943](#943)) ([6dfa1a9](6dfa1a9))
* **health_check:** add HTTP Basic authentication to `GET /health_check.txt` ([#559](#559)) ([d0ccd83](d0ccd83))
* improve example feed config in feed.yml and link to it ([#552](#552)) ([de08695](de08695))
* install Gemfile.lock specified bundler version ([4190160](4190160))
* integrate request_service and use ssrf_filter strategy by default ([#707](#707)) ([b7516fd](b7516fd))
* link included feeds to the instance feed directory ([#901](#901)) ([51ce79a](51ce79a))
* optionally allow APM using Sentry via env variable ([#696](#696)) ([94477d5](94477d5))
* redact sensitive feed data in structured logs ([#903](#903)) ([ee7df73](ee7df73))
* remove dependency on activesupport ([048cb73](048cb73))
* **runtime:** rebuild feed and api behavior around typed v1 services ([b61602d](b61602d))
* simplify feed creation contract & backend error handling ([#962](#962)) ([dfca027](dfca027))
* stabilize public http interface & slimmer docker ([#882](#882)) ([fe3f4be](fe3f4be))
* unify web and feed result surfaces ([#896](#896)) ([e747b23](e747b23))
* use parallel processing for feed retrieval in health_check.rb ([#665](#665)) ([4a24997](4a24997))

### Bug Fixes

* ArgumentError when RACK_TIMEOUT_SERVICE_TIMEOUT env var is set ([96acbab](96acbab)), closes [#527](#527)
* **auto_source:** respect headers from global config ([#691](#691)) ([3e9ba91](3e9ba91))
* **build:** only cleanup when there is a test container ([f7bafa6](f7bafa6))
* caching with dynamic parameters yields incorrect rss ([#589](#589)) ([bb945c2](bb945c2)), closes [#587](#587)
* **ci:** repair Ruby, OpenAPI, and frontend checks ([#880](#880)) ([ec6673b](ec6673b))
* defects for token/retry/loading UX ([#924](#924)) ([2d38633](2d38633))
* **docker:** missing curl installation for health check ([0bd9157](0bd9157))
* example feed in config/feeds.yml broken ([#664](#664)) ([b961897](b961897))
* **frontend:** preserve created feeds when preview loading fails ([#915](#915)) ([383ecc3](383ecc3))
* **frontend:** streamline web ux ([#916](#916)) ([85e79bf](85e79bf))
* harden container config defaults ([392997c](392997c))
* healthcheck broken due to missing curl ([c97e746](c97e746))
* keep unknown api v1 paths inside the api contract ([a820478](a820478))
* responds with http status 422 ([#738](#738)) ([ad9394c](ad9394c))
* **runtime:** polish relaunch smoke behavior and health checks ([65e1644](65e1644))
* stylesheets not included in feed ([#779](#779)) ([9116d9d](9116d9d))
* tzdata package not installed but required for tz conversion ([#663](#663)) ([55814d2](55814d2))
* **web:** harden feed reader fallback and rss rendering ([#944](#944)) ([438d9f6](438d9f6))
* **web:** harden observability env handling and Sentry log redaction ([#917](#917)) ([ed2b3e9](ed2b3e9))

### Performance Improvements

* enable YJIT ([729f31f](729f31f))

---

This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

🤖 I have created a release *beep* *boop*

---

## [1.2.0](v1.1.0...v1.2.0) (2026-05-01)

### Features

* add help text on error page ([eeee345](eeee345)), closes [#338](#338)
* add routed frontend feed creation workflow ([#963](#963)) ([2d1b71a](2d1b71a))
* **auto_source:** add support for `auto_source` feature ([#676](#676)) ([531dced](531dced))
* default browserless onboarding and request strategies ([#895](#895)) ([377cff0](377cff0))
* **deps:** use html2rss in latest development status ([#728](#728)) ([5885d1d](5885d1d))
* **docker:** switch to alpine 21 ([7adcc89](7adcc89))
* **docker:** upgrade to use ruby 3.3 image ([ceafe24](ceafe24))
* **docker:** use multilayer build to cut image size in half ([2f6e322](2f6e322))
* **docker:** use Ruby 3.4 ([4f7d795](4f7d795))
* **frontend:** polish result experience and validation tooling ([#964](#964)) ([b11665e](b11665e))
* **frontend:** relaunch the app with a focused v1 flow ([e0692d7](e0692d7))
* **frontend:** unify feed/result state flow ([#943](#943)) ([6dfa1a9](6dfa1a9))
* **health_check:** add HTTP Basic authentication to `GET /health_check.txt` ([#559](#559)) ([d0ccd83](d0ccd83))
* improve example feed config in feed.yml and link to it ([#552](#552)) ([de08695](de08695))
* install Gemfile.lock specified bundler version ([4190160](4190160))
* integrate request_service and use ssrf_filter strategy by default ([#707](#707)) ([b7516fd](b7516fd))
* link included feeds to the instance feed directory ([#901](#901)) ([51ce79a](51ce79a))
* optionally allow APM using Sentry via env variable ([#696](#696)) ([94477d5](94477d5))
* redact sensitive feed data in structured logs ([#903](#903)) ([ee7df73](ee7df73))
* remove dependency on activesupport ([048cb73](048cb73))
* **runtime:** rebuild feed and api behavior around typed v1 services ([b61602d](b61602d))
* simplify feed creation contract & backend error handling ([#962](#962)) ([dfca027](dfca027))
* stabilize public http interface & slimmer docker ([#882](#882)) ([fe3f4be](fe3f4be))
* unify web and feed result surfaces ([#896](#896)) ([e747b23](e747b23))
* use parallel processing for feed retrieval in health_check.rb ([#665](#665)) ([4a24997](4a24997))

### Bug Fixes

* ArgumentError when RACK_TIMEOUT_SERVICE_TIMEOUT env var is set ([96acbab](96acbab)), closes [#527](#527)
* **auto_source:** respect headers from global config ([#691](#691)) ([3e9ba91](3e9ba91))
* **build:** only cleanup when there is a test container ([f7bafa6](f7bafa6))
* caching with dynamic parameters yields incorrect rss ([#589](#589)) ([bb945c2](bb945c2)), closes [#587](#587)
* **ci:** repair Ruby, OpenAPI, and frontend checks ([#880](#880)) ([ec6673b](ec6673b))
* **ci:** robustly parse release tags and align config ([#972](#972)) ([2efd6ef](2efd6ef))
* defects for token/retry/loading UX ([#924](#924)) ([2d38633](2d38633))
* **docker:** missing curl installation for health check ([0bd9157](0bd9157))
* example feed in config/feeds.yml broken ([#664](#664)) ([b961897](b961897))
* **frontend:** preserve created feeds when preview loading fails ([#915](#915)) ([383ecc3](383ecc3))
* **frontend:** streamline web ux ([#916](#916)) ([85e79bf](85e79bf))
* harden container config defaults ([392997c](392997c))
* healthcheck broken due to missing curl ([c97e746](c97e746))
* keep unknown api v1 paths inside the api contract ([a820478](a820478))
* responds with http status 422 ([#738](#738)) ([ad9394c](ad9394c))
* **runtime:** polish relaunch smoke behavior and health checks ([65e1644](65e1644))
* stylesheets not included in feed ([#779](#779)) ([9116d9d](9116d9d))
* tzdata package not installed but required for tz conversion ([#663](#663)) ([55814d2](55814d2))
* **web:** harden feed reader fallback and rss rendering ([#944](#944)) ([438d9f6](438d9f6))
* **web:** harden observability env handling and Sentry log redaction ([#917](#917)) ([ed2b3e9](ed2b3e9))

### Performance Improvements

* enable YJIT ([729f31f](729f31f))

---

This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

Summary
Verification
Notes