From 3b378675c3d10d6f96f9c4026fab4a034f7148e5 Mon Sep 17 00:00:00 2001 From: Eric Eastwood Date: Wed, 28 Jun 2023 18:14:31 -0500 Subject: [PATCH 1/4] Update FAQ to explain `world_readable` only (#277) Follow-up to https://github.com/matrix-org/matrix-public-archive/pull/239 --- docs/faq.md | 66 ++++++++++++++++++++++++++--------------------------- 1 file changed, 33 insertions(+), 33 deletions(-) diff --git a/docs/faq.md b/docs/faq.md index 3c5fda0..6725558 100644 --- a/docs/faq.md +++ b/docs/faq.md @@ -19,54 +19,54 @@ messages from any given date and day-by-day navigation. ## Why did the archive bot join my room? -Only public Matrix rooms with `shared` or `world_readable` [history -visibility](https://spec.matrix.org/latest/client-server-api/#room-history-visibility) are -accessible in the Matrix Public Archive. In some clients like Element, the `shared` -option equates to "Members only (since the point in time of selecting this option)" and -`world_readable` to "Anyone" under the **room settings** -> **Security & Privacy** -> -**Who can read history?**. +Only Matrix rooms with `world_readable` [history +visibility](https://spec.matrix.org/latest/client-server-api/#room-history-visibility) +are accessible in the Matrix Public Archive and indexed by search engines. But the archive bot (`@archive:matrix.org`) will join any public room because it doesn't -know the history visibility without first joining. Any room without `world_readable` or -`shared` history visibility will lead a `403 Forbidden`. And if the public room is in -the room directory, it will be listed in the archive but will still lead to a `403 -Forbidden` in that case. +know the history visibility without first joining. Any room that doesn't have +`world_readable` history visibility will lead to a `403 Forbidden`. The Matrix Public Archive doesn't hold onto any data (it's stateless) and requests the messages from the homeserver every time.
The [archive.matrix.org](https://archive.matrix.org/) instance has some caching in place, 5 minutes for the current day, and 2 days for past content. -The Matrix Public Archive only allows rooms with `world_readable` history visibility to -be indexed by search engines. See the [opt -out](#how-do-i-opt-out-and-keep-my-room-from-being-indexed-by-search-engines) topic -below for more details. - -### Why does the archive user join rooms instead of browsing them as a guest? - -Guests require `m.room.guest_access` to access a room. Most public rooms do not allow -guests because even the `public_chat` preset when creating a room does not allow guest -access. Not being able to view most public rooms is the major blocker on being able to -use guest access. The idea is if I can view the messages from a Matrix client as a -random user, I should also be able to see the messages in the archive. - -Guest access is also a much different ask than read-only access since guests can also -send messages in the room which isn't always desirable. The archive bot is read-only and -does not send messages. +See the [opt out +section](#how-do-i-opt-out-and-keep-my-room-from-being-indexed-by-search-engines) below +for more details. ## How do I opt out and keep my room from being indexed by search engines? -Only public Matrix rooms with `shared` or `world_readable` history visibility are -accessible to view in the Matrix Public Archive. But only rooms with history visibility -set to `world_readable` are indexable by search engines. +Only Matrix rooms with `world_readable` [history +visibility](https://spec.matrix.org/latest/client-server-api/#room-history-visibility) +are accessible in the Matrix Public Archive and indexed by search engines. One easy way +to opt out is to change your room's history visibility to something else if you don't +intend for your room to be world readable. -Also see https://github.com/matrix-org/matrix-public-archive/issues/47 to track better -opt out controls.
+Dedicated opt-out controls are being tracked in +[#47](https://github.com/matrix-org/matrix-public-archive/issues/47). -As a workaround for [archive.matrix.org](https://archive.matrix.org/) today, you can ban -the `@archive:matrix.org` user if you don't want your room content to be shown in the +As a workaround for [archive.matrix.org](https://archive.matrix.org/), you can ban the +`@archive:matrix.org` user if you don't want your room content to be shown in the archive at all. +### Why does the archive user join rooms instead of peeking into the room or using guests? + +Since the archive only displays rooms with `world_readable` history visibility, we could +peek into the rooms without joining. This is being explored in +[#272](https://github.com/matrix-org/matrix-public-archive/pull/272). But peeking +doesn't work when the server doesn't know about the room already (this is commonly +referred to as federated peeking) which is why we have to fall back to joining the room +in any case. We could solve the federated peeking problem and avoid the join with +[MSC3266 room summaries](https://github.com/matrix-org/matrix-spec-proposals/pull/3266) +to check whether the room is `world_readable` even over federation. + +Guests are a completely separate concept and are controlled by the `m.room.guest_access` state +event in the room. Guest access is also a much different ask than read-only access since +guests can also send messages in the room which isn't always desirable. The archive bot +is read-only and does not send messages.
+ ## Technical details The main readme has a [technical overview](../README.md#technical-overview) of the From a79342f83c465cc74d4aa394bf9fc9b6240621df Mon Sep 17 00:00:00 2001 From: Eric Eastwood Date: Wed, 28 Jun 2023 18:15:18 -0500 Subject: [PATCH 2/4] Prepare changelog with #277 --- CHANGELOG.md | 1 + 1 file changed, 1 insertion(+) diff --git a/CHANGELOG.md b/CHANGELOG.md index 1f0f926..c92533e 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -10,6 +10,7 @@ - Prevent join event spam with stable `reason`, https://github.com/matrix-org/matrix-public-archive/pull/268 - Don't allow previewing `shared` history rooms, https://github.com/matrix-org/matrix-public-archive/pull/239 - Contributed by [@tulir](https://github.com/tulir) +- Update FAQ to explain `world_readable` only, https://github.com/matrix-org/matrix-public-archive/pull/277 Developer facing: From 0fc4421432ffa1ba61cdad63fe407d9464a01e14 Mon Sep 17 00:00:00 2001 From: Eric Eastwood Date: Wed, 28 Jun 2023 20:29:49 -0500 Subject: [PATCH 3/4] Indicate when the room was set to `world_readable` and by who (#278) --- server/lib/matrix-utils/fetch-room-data.js | 7 +++++++ server/routes/room-routes.js | 5 ++++- shared/hydrogen-vm-render-script.js | 1 + shared/viewmodels/ArchiveRoomViewModel.js | 5 +++++ shared/views/RightPanelContentView.js | 24 ++++++++++++++++++++-- 5 files changed, 39 insertions(+), 3 deletions(-) diff --git a/server/lib/matrix-utils/fetch-room-data.js b/server/lib/matrix-utils/fetch-room-data.js index e1ea483..eb39c40 100644 --- a/server/lib/matrix-utils/fetch-room-data.js +++ b/server/lib/matrix-utils/fetch-room-data.js @@ -210,9 +210,15 @@ const fetchRoomData = traceFunction(async function ( } let historyVisibility; + let historyVisibilityEventMeta; if (stateHistoryVisibilityResDataOutcome.reason === undefined) { const { data } = stateHistoryVisibilityResDataOutcome.value; historyVisibility = data?.content?.history_visibility; + historyVisibilityEventMeta = { + historyVisibility, + sender: 
data?.sender, + originServerTs: data?.origin_server_ts, + }; } let roomCreationTs; @@ -240,6 +246,7 @@ const fetchRoomData = traceFunction(async function ( canonicalAlias, avatarUrl, historyVisibility, + historyVisibilityEventMeta, roomCreationTs, predecessorRoomId, predecessorLastKnownEventId, diff --git a/server/routes/room-routes.js b/server/routes/room-routes.js index 7417ac9..a168b8e 100644 --- a/server/routes/room-routes.js +++ b/server/routes/room-routes.js @@ -833,7 +833,10 @@ router.get( if (!allowedToViewRoom) { throw new StatusError( 403, - `Only \`world_readable\` rooms can be viewed in the archive. ${roomData.id} has m.room.history_visiblity=${roomData.historyVisibility}` + `Only \`world_readable\` rooms can be viewed in the archive. ` + + `${roomData.id} has m.room.history_visibility=${roomData.historyVisibility} ` + + `(set by ${roomData.historyVisibilityEventMeta?.sender} on ` + + `${new Date(roomData.historyVisibilityEventMeta?.originServerTs).toISOString()})` ); } diff --git a/shared/hydrogen-vm-render-script.js b/shared/hydrogen-vm-render-script.js index 97a378c..4ef5cbe 100644 --- a/shared/hydrogen-vm-render-script.js +++ b/shared/hydrogen-vm-render-script.js @@ -118,6 +118,7 @@ async function mountHydrogen() { events, stateEventMap, shouldIndex, + historyVisibilityEventMeta: roomData.historyVisibilityEventMeta, basePath: config.basePath, }); diff --git a/shared/viewmodels/ArchiveRoomViewModel.js b/shared/viewmodels/ArchiveRoomViewModel.js index c38986f..3ae2f7e 100644 --- a/shared/viewmodels/ArchiveRoomViewModel.js +++ b/shared/viewmodels/ArchiveRoomViewModel.js @@ -75,6 +75,7 @@ class ArchiveRoomViewModel extends ViewModel { events, stateEventMap, shouldIndex, + historyVisibilityEventMeta, basePath, } = options; assert(homeserverUrl); @@ -85,6 +86,9 @@ class ArchiveRoomViewModel extends ViewModel { assert(events); assert(stateEventMap); assert(shouldIndex !== undefined); + assert(historyVisibilityEventMeta.historyVisibility); +
assert(historyVisibilityEventMeta.sender); + assert(historyVisibilityEventMeta.originServerTs); assert(events); this._room = room; @@ -213,6 +217,7 @@ class ArchiveRoomViewModel extends ViewModel { shouldShowTimeSelector, timeSelectorViewModel: this._timeSelectorViewModel, shouldIndex, + historyVisibilityEventMeta, get developerOptionsUrl() { return urlRouter.urlForSegments([ navigation.segment('room', room.id), diff --git a/shared/views/RightPanelContentView.js b/shared/views/RightPanelContentView.js index fbb0bc5..3ebab71 100644 --- a/shared/views/RightPanelContentView.js +++ b/shared/views/RightPanelContentView.js @@ -10,12 +10,28 @@ class RightPanelContentView extends TemplateView { render(t, vm) { assert(vm.shouldIndex !== undefined); assert(vm.shouldShowTimeSelector !== undefined); + assert(vm.historyVisibilityEventMeta.historyVisibility); + assert(vm.historyVisibilityEventMeta.sender); + assert(vm.historyVisibilityEventMeta.originServerTs); let maybeIndexedMessage = 'This room is not being indexed by search engines '; if (vm.shouldIndex) { - maybeIndexedMessage = 'This room is being indexed by search engines '; + maybeIndexedMessage = 'This room is being indexed by search engines'; } + const historyVisibilitySender = vm.historyVisibilityEventMeta.sender; + + let historyVisibilityDisplayValue = vm.historyVisibilityEventMeta.historyVisibility; + if (vm.historyVisibilityEventMeta.historyVisibility === 'world_readable') { + historyVisibilityDisplayValue = 'world readable'; + } + + const [historyVisibilitySetDatePiece, _timePiece] = new Date( + vm.historyVisibilityEventMeta.originServerTs + ) + .toISOString() + .split('T'); + return t.div( { className: 'RightPanelContentView', @@ -33,9 +49,13 @@ class RightPanelContentView extends TemplateView { className: 'RightPanelContentView_footer', }, [ + t.p([ + `This room is accessible in the archive because it was set to ` + + `${historyVisibilityDisplayValue} by ${historyVisibilitySender} on 
${historyVisibilitySetDatePiece}.`, + ]), t.p([ maybeIndexedMessage, - '(', + ' (', t.a( { className: 'external-link RightPanelContentView_footerLink', From 5de8cb4e3551dbba63f9c408ddbb165898efff5d Mon Sep 17 00:00:00 2001 From: Eric Eastwood Date: Wed, 28 Jun 2023 20:30:31 -0500 Subject: [PATCH 4/4] Prepare changelog with #278 --- CHANGELOG.md | 1 + 1 file changed, 1 insertion(+) diff --git a/CHANGELOG.md b/CHANGELOG.md index c92533e..76adc26 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -11,6 +11,7 @@ - Don't allow previewing `shared` history rooms, https://github.com/matrix-org/matrix-public-archive/pull/239 - Contributed by [@tulir](https://github.com/tulir) - Update FAQ to explain `world_readable` only, https://github.com/matrix-org/matrix-public-archive/pull/277 +- Indicate when the room was set to `world_readable` and by who, https://github.com/matrix-org/matrix-public-archive/pull/278 Developer facing:
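
Note on PATCH 3/4: `fetch-room-data.js` only assigns `historyVisibilityEventMeta` when the history visibility state request succeeds, while the 403 message in `room-routes.js` passes `historyVisibilityEventMeta?.originServerTs` straight to `new Date(...).toISOString()`, which throws a `RangeError` when the timestamp is `undefined`. A minimal sketch of a guard follows, assuming the metadata can be missing. The helper name `describeHistoryVisibilityEvent` and the fallback strings are hypothetical and not part of the patch.

```javascript
'use strict';

// Hypothetical helper (not part of the patch): describe who set the history
// visibility and when, without throwing when the metadata is missing.
function describeHistoryVisibilityEvent(historyVisibilityEventMeta) {
  // Fallback strings are assumptions chosen for illustration only.
  const sender = historyVisibilityEventMeta?.sender ?? 'unknown sender';
  const originServerTs = historyVisibilityEventMeta?.originServerTs;

  // `new Date(undefined)` is an invalid date and calling `toISOString()` on it
  // throws a RangeError, so check validity before formatting.
  const date = new Date(originServerTs);
  const dateString = Number.isNaN(date.getTime()) ? 'unknown date' : date.toISOString();

  return `set by ${sender} on ${dateString}`;
}

// Usage: metadata missing because the state request failed.
console.log(describeHistoryVisibilityEvent(undefined));
// → set by unknown sender on unknown date
```

The same check could sit in front of the `.toISOString().split('T')` call in `RightPanelContentView.js`, although the `assert` calls added there already reject missing metadata before rendering.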