Previously, I was seeing failures like this locally:
```
1 failing
1) matrix-viewer
Matrix Viewer
Room directory
pagination is seamless:
AssertionError [ERR_ASSERTION]: Make sure we saw all visible rooms paginating through the directory
+ expected - actual
"planet-1689366398300-room-29"
"planet-1689366398300-room-31"
"planet-1689366398300-room-32"
"planet-1689366398300-room-34"
- "planet-1689366398300-room-34"
"planet-1689366398300-room-35"
"planet-1689366398300-room-37"
"planet-1689366398300-room-38"
"planet-1689366398300-room-4"
at Context.<anonymous> (test/e2e-tests.js:2835:16)
at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
```
Happens to address part of https://github.com/matrix-org/matrix-public-archive/issues/271
but made primarily as a follow-up to https://github.com/matrix-org/matrix-public-archive/pull/239
---
Only 42% of rooms in the `matrix.org` room directory are `world_readable`, which means most pages of rooms would be half-empty if we just naively fetched 9 rooms at a time.
Ideally, we would be able to add a filter directly to `/publicRooms` to only grab the `world_readable` rooms and still get full pages, but the filter option doesn't allow us to slice by `world_readable` history visibility.
Instead, we have to paginate until we get a full grid of 9 rooms, then make a final `/publicRooms` request to backtrack to the exact continuation point so the next page won't skip any rooms in between.
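The paginate-then-backtrack approach described above can be sketched roughly as follows. This is a minimal illustration and not the code from this PR: `fetchWorldReadablePage` and the injected `fetchPublicRooms({ since, limit })` helper are hypothetical names, and the helper is assumed to resolve to the `{ chunk, next_batch }` shape of a `/publicRooms` response.

```javascript
// Hypothetical sketch: `fetchPublicRooms({ since, limit })` stands in for a
// `/publicRooms` request and resolves to `{ chunk, next_batch }`.
async function fetchWorldReadablePage(fetchPublicRooms, { since, pageSize = 9 } = {}) {
  const rooms = [];
  let currentSince = since;
  let nextToken;

  while (rooms.length < pageSize) {
    const { chunk, next_batch } = await fetchPublicRooms({
      since: currentSince,
      limit: pageSize,
    });

    // Walk the chunk and keep only `world_readable` rooms, remembering how
    // many rooms (kept or skipped) we consumed from this chunk.
    let consumed = 0;
    for (const room of chunk) {
      consumed += 1;
      if (room.world_readable) rooms.push(room);
      if (rooms.length === pageSize) break;
    }

    if (consumed < chunk.length) {
      // The page filled up mid-chunk. Backtrack: request exactly the rooms we
      // consumed so the returned token points right after the last one.
      const exact = await fetchPublicRooms({ since: currentSince, limit: consumed });
      nextToken = exact.next_batch;
      break;
    }

    nextToken = next_batch;
    if (!next_batch) break; // Reached the end of the directory
    currentSince = next_batch;
  }

  return { rooms, nextToken };
}
```

The extra request only happens when a page fills up in the middle of a chunk, and it trades one more round trip for a continuation token that neither skips nor repeats rooms.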
---
We had empty spaces in the grid before because some rooms in the room directory are private, and we already filtered those out. But that was a much rarer experience since only 2% of rooms were private.
Only `world_readable` can be considered as opting into having history publicly on the web. Anything else must not be archived until there's a dedicated state event for opting into archiving.
Set `X-Date-Temporal-Context: [past|present|future]` header for easy cache rules:
- Cache `past` things heavily
- Cache `present`/`future` things for 5 minutes
This accomplishes the goal we set out for:
> - We can cache all responses except for the latest UTC day (and anything in the future). ex. `/!aMzLHLvScQCGKDNqCB:gitter.im/date/2022/10/13`
> - For the latest day, we could set the cache expire after 5 minutes or so
>
> *-- [Matrix Public Archive deployment issue](https://github.com/vector-im/sre-internal/issues/2079)*
And this way we don't have to do any fancy date parsing and comparison from the URL, which is probably not even possible with Cloudflare cache rules.
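As a rough illustration of how the header value could be derived, here is a minimal sketch. The function name `getTemporalContext` and its parameters are hypothetical; it assumes the page is identified by the timestamp of the start of its UTC day.

```javascript
const ONE_DAY_MS = 24 * 60 * 60 * 1000;

// Hypothetical helper: classify the UTC day being viewed relative to "now".
// `archiveDayStartTs` is assumed to be the timestamp (ms) of 00:00 UTC on the
// requested day.
function getTemporalContext(archiveDayStartTs, nowTs = Date.now()) {
  // Unix timestamps have no leap seconds, so this is the start of today in UTC
  const todayStartTs = nowTs - (nowTs % ONE_DAY_MS);

  if (archiveDayStartTs < todayStartTs) return 'past';
  if (archiveDayStartTs < todayStartTs + ONE_DAY_MS) return 'present';
  return 'future';
}
```

Assuming an Express-style response object, the result could then be set with something like `res.set('X-Date-Temporal-Context', getTemporalContext(dayStartTs))`, leaving the actual cache durations to the cache rules.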
Fix https://github.com/matrix-org/matrix-public-archive/issues/59
Other updates:
- Update tests to use the `/roomid/room1/date/2022/01/03` format instead of trying to retrofit the weird alias stuff on there, which also makes the fancy-to-actual URL utilities much simpler.
- Update to specify `archiveMessageLimit` in the test case because pages have a different number of events depending on whether we are against a boundary, hidden events, etc.
`"no-unused-vars": ["error", { "destructuredArrayIgnorePattern": "^_" }],` was only [introduced in `eslint@8.11.0`](0fd6bb213a/CHANGELOG.md), so we had to update `eslint`.
- Less test bulk
- Single source of truth: there is no mismatch between the comment and the expectations (we already caught a few mistakes in the conversion thanks to this benefit)
- Easier to maintain and update
- Fix https://github.com/matrix-org/matrix-public-archive/issues/7
- A URL with time looks like
- `/r/too-many-messages-on-day:my.synapse.server/date/2022/11/16T23:59`
- Or when more precision is required (seconds): `/r/too-many-messages-on-day:my.synapse.server/date/2022/11/16T23:59:59`
- Add new custom time picker/scrubber (pictured below) with momentum scrubbing
- Native built-in `<input type="time">` for accessibility and for easier picking if you prefer that.
- Uses localized time strings
- Design inspired by Thiago Sanchez's *Time Zone Translate* concept, https://dribbble.com/shots/14590546-Time-Zone-Translate
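To make the two URL shapes above concrete, here is a minimal sketch of parsing the part of the path after `/date/` into a UTC timestamp. `parseArchiveDatePath` is a hypothetical name and this is not the parser used in the project; it does no range validation beyond the digit counts.

```javascript
// Matches `YYYY/MM/DD`, optionally followed by `THH:MM` or `THH:MM:SS`
const DATE_PATH_RE = /^(\d{4})\/(\d{2})\/(\d{2})(?:T(\d{2}):(\d{2})(?::(\d{2}))?)?$/;

// Hypothetical helper: returns a UTC timestamp in milliseconds, or `null` when
// the path does not look like a date. Missing time parts default to zero.
function parseArchiveDatePath(datePath) {
  const match = DATE_PATH_RE.exec(datePath);
  if (!match) return null;

  const [, yyyy, mm, dd, hh = '0', min = '0', ss = '0'] = match;
  return Date.UTC(
    Number(yyyy),
    Number(mm) - 1, // `Date.UTC` months are zero-based
    Number(dd),
    Number(hh),
    Number(min),
    Number(ss)
  );
}
```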
Fix https://github.com/matrix-org/matrix-public-archive/issues/46
Follow-up to https://github.com/matrix-org/matrix-public-archive/pull/71
Summary:
- Changes "Jump to next activity in room" to actually take you to the next 100 messages ahead. Previously, it only jumped to the single next event in the room, which meant a lot of backwards overlap each time.
- Jumping in this direction will also start your scroll position at the top of the timeline to continue reading seamlessly (`?continue=top`)
- Adds "Jump to previous activity in room" to the top of the timeline to continue reading the previous part of the conversation.
[1]: There is a caveat with seamless here which is also commented on in the code:
> XXX: This is flawed in the fact that when we go `/messages?dir=b` it could backfill messages which will fill up the response before we perfectly connect and continue from the position they were jumping from before. When `/messages?dir=f` backfills, we won't have this problem anymore because any messages backfilled in the forwards direction would be picked up the same going backwards.
(need forwards fill MSC)
Also does friendly redirects if you don't exactly use the right URL pattern.
For example, if you paste the full room ID with the `!` like `/roomid/!foo:bar`,
it will properly redirect you to `/roomid/foo:bar`. It also does this sort of
thing for URL-encoded room IDs and aliases.
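A minimal sketch of this kind of normalization for the `/roomid/` case, assuming a hypothetical `getCanonicalRoomPath` helper that returns the path to redirect to, or `null` when the request is already in the expected form:

```javascript
// Hypothetical helper, not the project's actual route handler.
function getCanonicalRoomPath(requestPath) {
  const match = /^\/roomid\/([^/]+)(\/.*)?$/.exec(requestPath);
  if (!match) return null;

  let decodedRoomId;
  try {
    // Handle URL-encoded IDs like `%21foo%3Abar`
    decodedRoomId = decodeURIComponent(match[1]);
  } catch {
    return null; // Malformed percent-encoding, nothing sensible to redirect to
  }

  // Drop the leading `!` sigil since the `/roomid/` prefix already implies it
  const canonicalPath = `/roomid/${decodedRoomId.replace(/^!/, '')}${match[2] || ''}`;
  return canonicalPath === requestPath ? null : canonicalPath;
}
```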
Fix https://github.com/matrix-org/matrix-public-archive/issues/25
Page-load with the correct homeserver selected (according to `?homeserver`).
Fix https://github.com/matrix-org/matrix-public-archive/issues/92
Also makes sure that the `?homeserver` is always available somewhere in the list, whether that be in the available homeserver list or the added homeserver list, depending on whether someone cleared it out or never had it because they visited from someone else's link.
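The rule can be sketched like this, using a hypothetical `ensureHomeserverListed` helper and plain arrays for the two lists (the real view model is likely shaped differently):

```javascript
// Hypothetical helper: returns the "added" list, extended with the selected
// homeserver when it is not present in either list yet.
function ensureHomeserverListed({ selected, available, added }) {
  if (!selected) return added;
  if (available.includes(selected) || added.includes(selected)) return added;
  return [...added, selected];
}
```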
1. Add surrounding messages to the given messages so we have a full screen of content to make it feel lively even in quiet rooms
- As you scroll around the timeline across different days, the date changes in the URL, calendar, etc
2. Add summary item to the bottom of the timeline that explains if we couldn't find any messages in the specific day requested
- Also allows you to jump to the next activity in the room. Adds `/:roomId/jump?ts=xxx&dir=[f|b]` to facilitate this.
- Part of https://github.com/matrix-org/matrix-public-archive/issues/46
3. Add developer options modal which is linked from the bottom of the right-panel
- Adds an option so you can debug the `IntersectionObserver` and how it's selecting the active day from the top-edge of the scroll viewport.
- In the future, this will also include a nice little visualization of the backend timing traces
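For illustration, a link to the jump route mentioned in item 2 could be built like this. `buildJumpPath` and its `roomPath` argument are hypothetical names; only the `ts` and `dir` query parameters come from the route above.

```javascript
// Hypothetical helper: build a `/jump` link for a room path
function buildJumpPath(roomPath, { ts, dir }) {
  if (dir !== 'f' && dir !== 'b') {
    throw new Error(`dir must be "f" (forwards) or "b" (backwards), got ${dir}`);
  }
  const params = new URLSearchParams({ ts: String(ts), dir });
  return `${roomPath}/jump?${params.toString()}`;
}
```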
Add a test to make sure the archive doesn't fail when the target event of an event relation is missing from the list of provided events, for example when someone replies to an event from so long ago that it is out of our range.
In the case of missing relations, Hydrogen does `_loadContextEntryNotInTimeline` because it can't find the event locally which throws an `uncaughtException`. Before https://github.com/matrix-org/matrix-public-archive/pull/51, the `uncaughtException` killed the Hydrogen `child_process` before it could pass back the HTML. Now this PR mainly just adds a test to make sure it works.
```
TypeError: Cannot read properties of undefined (reading 'storeNames')
at TimelineReader.readById (hydrogen-web\target\lib-build\hydrogen.cjs.js:12483:33)
at Timeline._getEventFromStorage (hydrogen-web\target\lib-build\hydrogen.cjs.js:12762:46)
at Timeline._loadContextEntryNotInTimeline (hydrogen-web\target\lib-build\hydrogen.cjs.js:12747:35)
at Timeline._loadContextEntriesWhereNeeded (hydrogen-web\target\lib-build\hydrogen.cjs.js:12741:14)
at Timeline.addEntries (hydrogen-web\target\lib-build\hydrogen.cjs.js:12699:10)
at mountHydrogen (4-hydrogen-vm-render-script.js:204:12)
at 4-hydrogen-vm-render-script.js:353:1
at Script.runInContext (node:vm:139:12)
at _renderHydrogenToStringUnsafe (matrix-public-archive\server\hydrogen-render\3-render-hydrogen-to-string-unsafe.js:102:41)
at async process.<anonymous> (matrix-public-archive\server\hydrogen-render\2-render-hydrogen-to-string-fork-script.js:18:27)
```
1. Build test homeserver Docker images which can federate with each other
2. Run end-to-end (e2e) tests
#### Dev notes
Sharing variables across jobs when the `services` field can't access the `env` context, https://github.community/t/how-to-use-env-with-container-image/17252/24
```yaml
env:
  FOO: bar
jobs:
  set_env:
    outputs:
      var: ${{ steps.save_var.outputs.var }}
    steps:
      - id: save_var
        # `::set-output` is deprecated; write to `$GITHUB_OUTPUT` instead
        run: echo "var=${{ env.FOO }}" >> "$GITHUB_OUTPUT"
  actual_job:
    needs: set_env
    container:
      image: ...whatever_you_need_here...${{ needs.set_env.outputs.var }}
```
Remove `matrix-bot-sdk` usage in tests because it didn't support timestamp massaging (`?ts`) and it's not really necessary to rely on since we can just call the API directly 🤷. `matrix-bot-sdk` is also very annoying because it has to build the Rust crypto packages.
We're now using direct `fetch` requests against the Matrix API and a lightweight `client` object.
All 3 current tests pass ✅