Commit Graph

75 Commits

Author SHA1 Message Date
Eric Eastwood 6568fc7102
Add Apache 2.0 license (#55) 2022-08-30 18:35:36 -05:00
Eric Eastwood b81df10c8e
Use JSON5 for configuration files with comments (#52)
Use JSON5 for configuration files with comments. Now we can leave the available config in `config.default.json` without having to add weird instructions to remove the `xxx`, etc

 - https://www.npmjs.com/package/json5
 - https://www.npmjs.com/package/nconf
 - https://github.com/indexzero/nconf/issues/113#issuecomment-69999413
2022-08-29 20:33:02 -05:00
Eric Eastwood 36925cd603
Add test to make sure the archive doesn't fail when event for event relation is missing and not included in list of provided events (#43)
Add test to make sure the archive doesn't fail when event for event relation is missing and not included in list of provided events. Like if someone is replying to an event that was from long ago out of our range.

In the case of missing relations, Hydrogen does `_loadContextEntryNotInTimeline` because it can't find the event locally which throws an `uncaughtException`. Before https://github.com/matrix-org/matrix-public-archive/pull/51, the `uncaughtException` killed the Hydrogen `child_process` before it could pass back the HTML. Now this PR mainly just adds a test to make sure it works.
```
TypeError: Cannot read properties of undefined (reading 'storeNames')
    at TimelineReader.readById (hydrogen-web\target\lib-build\hydrogen.cjs.js:12483:33)
    at Timeline._getEventFromStorage (hydrogen-web\target\lib-build\hydrogen.cjs.js:12762:46)
    at Timeline._loadContextEntryNotInTimeline (hydrogen-web\target\lib-build\hydrogen.cjs.js:12747:35)
    at Timeline._loadContextEntriesWhereNeeded (hydrogen-web\target\lib-build\hydrogen.cjs.js:12741:14)
    at Timeline.addEntries (hydrogen-web\target\lib-build\hydrogen.cjs.js:12699:10)
    at mountHydrogen (4-hydrogen-vm-render-script.js:204:12)
    at 4-hydrogen-vm-render-script.js:353:1
    at Script.runInContext (node:vm:139:12)
    at _renderHydrogenToStringUnsafe (matrix-public-archive\server\hydrogen-render\3-render-hydrogen-to-string-unsafe.js:102:41)
    at async process.<anonymous> (matrix-public-archive\server\hydrogen-render\2-render-hydrogen-to-string-fork-script.js:18:27)
```
2022-08-29 19:42:18 -05:00
Eric Eastwood bdaa98e722
Make the `child_process` error catching more robust (`uncaughtException`) (#51)
Split off from https://github.com/matrix-org/matrix-public-archive/pull/43

Listen to `process.on('uncaughtException', ...)` and handle the async errors ourselves so it no longer fails the child process.

And if the process does exit with status code 1 (error), we have those underlying errors serialized and shown.
2022-08-29 19:13:56 -05:00
Eric Eastwood e9d13db911
Add test for joining a new federated room (#31)
Add test for joining a new federated room and making sure the messages are available (homeserver should backfill).

Synapse changes: https://github.com/matrix-org/synapse/pull/13205, https://github.com/matrix-org/synapse/pull/13320
2022-08-29 18:56:31 -05:00
Eric Eastwood b5b79b94f2
Manually instrument some archive logic (#44) 2022-08-29 14:13:13 -05:00
Eric Eastwood 27886a92d3
Add available Jaeger port (#48) 2022-08-29 14:08:15 -05:00
Eric Eastwood fba827a44e
Add app screenshot (#42) 2022-07-18 13:29:47 -05:00
Eric Eastwood 07bc094890
Enable tracing by config so we can enable from argv, env variable, or config file (#41) 2022-07-14 11:26:53 -05:00
Eric Eastwood ddfe94beab
OpenTelemetry tracing so we can see spans where the app is taking time (#27)
OpenTelemetry tracing so we can see spans where the app is taking time.
For the user, we specifically show the spans for the external API HTTP requests
that are slow (so we know when the Matrix API is being slow).

Enable tracing:

 - `npm run start -- --tracing`
 - `npm run start-dev -- --tracing`

What does this PR change:

 - Adds OpenTelemetry tracing with some of the automatic instrumentation (includes HTTP and express)
    - We ignore traces for serving static assets (just noise)
 - Adds `X-Trace-Id` to the response headers
 - Adds `window.tracingSpansForRequest` which includes the external HTTP API requests made during the request
 - Adds a fancy 504 timeout page that includes trace details and lists the slow HTTP requests
 - Adds `jaegerTracesEndpoint` configuration to export tracing spans to Jaeger
 - Related to, https://github.com/matrix-org/matrix-public-archive/issues/26
2022-07-14 11:08:50 -05:00
Eric Eastwood 13eb92b067
Make sure we finish sending the HTML payload before we exit the process (#38)
I encountered a page which responded successfully but all of the Hydrogen HTML was missing. It just had the boilerplate around it.

What I am guessing happened is that since `process.send` is async, with a sufficiently large
payload and race condition, `process.exit(0)` was being called before it finished sending.

Related:
 - https://stackoverflow.com/questions/34627546/process-send-is-sync-async-on-nix-windows
 - 56d9584a0e
 - https://github.com/nodejs/node/issues/6767
2022-07-06 19:24:29 -05:00
Eric Eastwood 7eaa103a28
Fix large option payloads throwing E2BIG and ENAMETOOLONG (#37)
Follow-up to https://github.com/matrix-org/matrix-public-archive/pull/36
2022-07-05 18:00:29 -05:00
Eric Eastwood f738dbc1da
Stop Hydrogen from running in the background after we get our SSR HTML render data (#36)
We now run the Hydrogen render in a `child_process` so we can exit the whole render process. We still use the `vm` to setup the browser-like globals. With a `vm`, everything continues to run even after it returns and there isn't a way to clean up, stop, kill, terminate the vm script or context so we need this extra `child_process` now to clean up. I don't like the complexity necessary for this though. I wish the `vm` API allowed for this use case. The only way to stop a `vm` is the `timeout` and we want to stop as soon as we return.

Fix https://github.com/matrix-org/matrix-public-archive/issues/34
2022-07-05 17:30:52 -05:00
Eric Eastwood 9c8f980e2a Remove random log 2022-07-05 11:21:04 -05:00
Eric Eastwood 17f2c399dd
Expose arguments for `renderHydrogenToString` (pure function) so we can reproduce when error occurs while we render Hydrogen (#35)
`renderHydrogenToString` is a pure function (probably) which means it will give the same output given the same input. This means, that if we give it a certain input and an error occurs, we should be able to reproduce it again if we have the arguments. This PR exposes those arguments in the logged error so we can investigate what's going wrong.

Added so we can investigate https://github.com/matrix-org/matrix-public-archive/issues/34 better and reproduce locally.
2022-07-05 10:30:12 -05:00
Eric Eastwood d508521171
Use stable test selectors from Hydrogen (`data-testid`)
Follow-up to https://github.com/matrix-org/matrix-public-archive/pull/29#discussion_r909492946

---

Updated Hydrogen to add the consistent `data-testid` attribute selectors,
https://github.com/vector-im/hydrogen-web/pull/773

```
npm install hydrogen-view-sdk@npm:@mlm/hydrogen-view-sdk@0.0.13-scratch
```
2022-07-05 06:23:47 -05:00
Eric Eastwood cfbd6182a9 Explain why parallel 2022-06-29 13:57:06 +02:00
Eric Eastwood 4690349816
Remove unneeded `include_redundant_members` from `/messages` `filter` and test that member state is still visible (#29)
Follow-up to https://github.com/matrix-org/matrix-public-archive/pull/28#discussion_r909428366
2022-06-29 06:56:13 -05:00
Eric Eastwood 57174db6e0
Fix archive not responding in big rooms because homeserver times out on `/context` request
Add `filter={"lazy_load_members":true}` so that `/context` responds without timing out by returning just the state for the sender of the included event. Otherwise, the homeserver returns all state in the room at that point in time which in big rooms, can be 100k member events that we don't care about anyway. Synapse seems to timeout at about the ~5k state event mark.


## Dev notes

Without `filter={"lazy_load_members":true}`, Synapse can only handle ~4k member events (probably state in general) before `/context` times out before it ever responds in the 180 second window. I'm only looking at the member count as a rough proxy but the number of member events + state will be more.


Loads | Member count | Alias | Matrix.to | API request
--- | --- | --- | --- | ---
 | 39k | `#matrix:matrix.org` | [🔗 ](https://matrix.to/#/!OGEhHVWSdvArJzumhm:matrix.org/$xo-tESRNP1Vg1RxLXIfmMeO6dA6-u9XuE2lv6toeKcw?via=matrix.org) | `https://matrix-client.matrix.org/_matrix/client/r0/rooms/!OGEhHVWSdvArJzumhm:matrix.org/context/$xo-tESRNP1Vg1RxLXIfmMeO6dA6-u9XuE2lv6toeKcw?limit=0`
 | 31k |  `#openwisp_general:gitter.im` | [🔗 ](https://matrix.to/#/!RBzfoBeqYcCwLAAenz:gitter.im/$YFjmbmH0NRPmfVsPyxhM0jcK4RFR_CdCigtHSgCTLSc?via=matrix.org) |`https://matrix-client.matrix.org/_matrix/client/r0/rooms/!RBzfoBeqYcCwLAAenz:gitter.im/context/$YFjmbmH0NRPmfVsPyxhM0jcK4RFR_CdCigtHSgCTLSc?limit=0`
 | 16k | `#raspberrypi:matrix.org` | [🔗 ](https://matrix.to/#/!wOlkWNmgkAZFxbTaqj:matrix.org/$WMh_QauoyW6cjFDuUeNgsnUSgDn7C2XlrUnlijhdmdk?via=matrix.org) | `https://matrix-client.matrix.org/_matrix/client/r0/rooms/!wOlkWNmgkAZFxbTaqj:matrix.org/context/$WMh_QauoyW6cjFDuUeNgsnUSgDn7C2XlrUnlijhdmdk?limit=0`
Loads with warm cache | 7.2k | `#element-android:matrix.org` | [🔗 ](https://matrix.to/#/!AZozoWghOYSIAfaZjJ:matrix.org/$Qmb8vmaD91PIM6ROfh-2jApPoJgH2Q7NnjiwRdiEPZE?via=matrix.org) | `https://matrix-client.matrix.org/_matrix/client/r0/rooms/!AZozoWghOYSIAfaZjJ:matrix.org/context/$Qmb8vmaD91PIM6ROfh-2jApPoJgH2Q7NnjiwRdiEPZE?limit=0`
  | 4k | `#nim-science:envs.net` | [🔗 ](https://matrix.to/#/!IpFtPSbgfrZrVcVyti:envs.net/$bDRop4yvjFOl3HMouAyx4mtau-JaAxvBUJ-MqftAm7E?via=matrix.org) | `https://matrix-client.matrix.org/_matrix/client/r0/rooms/!IpFtPSbgfrZrVcVyti:envs.net/context/$bDRop4yvjFOl3HMouAyx4mtau-JaAxvBUJ-MqftAm7E?limit=0`
  | 2.3k | `#nim-webdev:matrix.org` | [🔗 ](https://matrix.to/#/!EoyccMaVGwdqyfKMAL:matrix.org/$VdRyNb1F-gLdDP9nDYK7xAh_LcE822fEfDYWz5re8dE?via=matrix.org) | `https://matrix-client.matrix.org/_matrix/client/r0/rooms/!EoyccMaVGwdqyfKMAL:matrix.org/context/$VdRyNb1F-gLdDP9nDYK7xAh_LcE822fEfDYWz5re8dE?limit=0`
2022-06-29 05:37:40 -05:00
Eric Eastwood 724c562d6f
Fix image not running in K8s (ROSA) (#24)
Fix image not running in K8s (ROSA)

Trying to solve the container exiting with error code `243`. `npm` throws `243` for any command you try to run with it.

Here is how to get into the container while the entrypoint normally errors.
```
$ oc run -i --tty --image=ghcr.io/matrix-org/matrix-public-archive/matrix-public-archive:sha-65edaea1a9713bb2cc93fa98638327fdeff765cd mpa-test2 --port=3050 --restart=Never --env="DOMAIN=cluster" --command /bin/bash
$ npm start

$ echo $?
243
```

Why does the same image work in Docker?
```
$ docker run -it --entrypoint /bin/bash ghcr.io/matrix-org/matrix-public-archive/matrix-public-archive:sha-65edaea1a9713bb2cc93fa98638327fdeff765cd
$ npm start
# it starts 
```


## Root cause

Seems like it could be either of these issues. Both are `8.6.0`+ problems as well which lines up with the `npm` version available in each base image.

 - https://github.com/npm/cli/issues/4769
 - https://github.com/npm/cli/issues/4996
    - Specific comment about breaking stuff in k8s for other people, https://github.com/npm/cli/issues/4996#issuecomment-1150290804
2022-06-15 18:42:15 -05:00
Eric Eastwood bd5c14242e
Make sure container is able to start up (#23)
Follow-up to https://github.com/matrix-org/matrix-public-archive/pull/22
2022-06-15 17:12:44 -05:00
Eric Eastwood 780600e3dd
Containerize app (make a Docker image for `matrix-public-archive`) (#22)
Containerize app (make a Docker image for `matrix-public-archive`)

```
ghcr.io/matrix-org/matrix-public-archive/matrix-public-archive
```

... with a variety of tags!


Packages viewable on https://github.com/matrix-org/matrix-public-archive/pkgs/container/matrix-public-archive%2Fmatrix-public-archive
2022-06-15 02:28:18 -05:00
Eric Eastwood 3e85ae3b60
Easy get the app running steps (#21)
Easy get the app running steps:

 1. Install
 2. Edit the config to point at your homeserver
 3. Run the app!
 
---

This is made possible thanks to the change we made in https://github.com/matrix-org/matrix-public-archive/pull/20#discussion_r897567619 to start using a scoped custom version of `hydrogen-view-sdk`. This includes all of the scratch changes necessary to get this project running which makes the `npm install` just work out of the box without all of the `npm link` hassle.
2022-06-15 01:21:10 -05:00
Eric Eastwood cda2d98df0
Add testing to CI (#20)
1. Build test homeserver Docker images which can federate with each other
 2. Run end-to-end (e2e) tests

#### Dev notes

Sharing variables across jobs when the `services` field can't access the `env` context, https://github.community/t/how-to-use-env-with-container-image/17252/24
```yaml
env:
  FOO: bar

jobs:
  set_env:
    outputs:
      var: ${{ steps.save_var.outputs.var }}
    steps:
      - id: save_var
         run: echo "::set-output name=var::${{ env.FOO }}"
  actual_job:
    needs: set_env
    container:
      image: ...whatever_you_need_here...${{ needs.set_env.outputs.var }}
```
2022-06-15 00:59:41 -05:00
Eric Eastwood cead1796b8
Better explain how to get the project and tests to run at this point (#19) 2022-06-10 18:52:12 -05:00
Eric Eastwood 1bd5ecff32
Remove extra body margin causing offset from viewport (#18) 2022-06-10 15:57:22 -05:00
Eric Eastwood 952a3acd4a
Explain how config inherits (#17) 2022-06-09 21:37:07 -05:00
Eric Eastwood 9fc71a3412
Remove `matrix-bot-sdk` usage in tests (#15)
Remove `matrix-bot-sdk` usage in tests because it didn't have timestamp massaging `?ts` and it's not really necessary to rely on since we can just call the API directly 🤷. `matrix-bot-sdk` is also very annoying having to build rust crypto packages.

We're now using direct `fetch` requests against the Matrix API and lightweight `client` object.

All 3 current tests pass 
2022-06-09 20:44:57 -05:00
Eric Eastwood 7dfe8cabc9
Add lightbox support and Hydrogen URL hashes relative to the room (#12)
Add image lightbox support and Hydrogen URL hashes relative to the room

Related to https://github.com/vector-im/hydrogen-web/issues/677

Requires the changes from https://github.com/vector-im/hydrogen-web/pull/749 (split out from https://github.com/vector-im/hydrogen-web/pull/653)

![](https://user-images.githubusercontent.com/558581/172526457-38c108e8-8c46-4e0c-9979-734348ec67fc.gif)


### Hydrogen routing relative to the room (remove session and room from the URL hash)

Before:
Page URL: doesn't work
```html
<div class="Timeline_messageBody">
   <div class="media" style="max-width: 400px">
      <div class="spacer" style="padding-top: 48.75%;"></div>
      <a href="undefined">
          <img src="http://192.168.1.182:8008//_matrix/media/r0/thumbnail/my.synapse.server/RxfuMxEgYcXHKYWERkKVUkqO?width=400&amp;height=195&amp;method=scale" alt="Friction_between_surfaces.jpeg" title="Friction_between_surfaces.jpeg" style="max-width: 400px; max-height: 195px;">
      </a>
      <time>2/24 6:20 PM</time>
   </div>
   <!--node binding placeholder-->
</div>
```

Before (not relative):
Page URL: `http://localhost:3050/!HBehERstyQBxyJDLfR:my.synapse.server/date/2022/02/24#/session/123/room/!HBehERstyQBxyJDLfR:my.synapse.server/lightbox/$17cgP6YBP9ny9xuU1vBmpOYFhRG4zpOe9SOgWi2Wxsk`
```html
<div class="Timeline_messageBody">
   <div class="media" style="max-width: 400px">
      <div class="spacer" style="padding-top: 48.75%;"></div>
      <a href="#/session/123/room/!HBehERstyQBxyJDLfR:my.synapse.server/lightbox/$17cgP6YBP9ny9xuU1vBmpOYFhRG4zpOe9SOgWi2Wxsk">
          <img src="http://192.168.1.182:8008//_matrix/media/r0/thumbnail/my.synapse.server/RxfuMxEgYcXHKYWERkKVUkqO?width=400&amp;height=195&amp;method=scale" alt="Friction_between_surfaces.jpeg" title="Friction_between_surfaces.jpeg" style="max-width: 400px; max-height: 195px;">
      </a>
      <time>2/24 6:20 PM</time>
   </div>
   <!--node binding placeholder-->
</div>
```

After (nice relative links):
Page URL: `http://localhost:3050/!HBehERstyQBxyJDLfR:my.synapse.server/date/2022/02/24#/lightbox/$17cgP6YBP9ny9xuU1vBmpOYFhRG4zpOe9SOgWi2Wxsk`
```html
<div class="Timeline_messageBody">
   <div class="media" style="max-width: 400px">
      <div class="spacer" style="padding-top: 48.75%;"></div>
      <a href="#/lightbox/$17cgP6YBP9ny9xuU1vBmpOYFhRG4zpOe9SOgWi2Wxsk">
          <img src="http://192.168.1.182:8008//_matrix/media/r0/thumbnail/my.synapse.server/RxfuMxEgYcXHKYWERkKVUkqO?width=400&amp;height=195&amp;method=scale" alt="Friction_between_surfaces.jpeg" title="Friction_between_surfaces.jpeg" style="max-width: 400px; max-height: 195px;">
      </a>
      <time>2/24 6:20 PM</time>
   </div>
   <!--node binding placeholder-->
</div>
```
2022-06-08 14:03:36 -05:00
Eric Eastwood 940c73868f
Organize file and add more nodemon logging (#14)
Spawned from https://github.com/vitejs/vite/issues/8492
2022-06-07 20:27:22 -05:00
Eric Eastwood cc958326ea
Use new Hydrogen light-theme asset directly instead of legacy style.css (#13) 2022-06-07 20:21:56 -05:00
Eric Eastwood dff2209e56
Fix nodemon not being able to watch for changes because vite is interfering (#11)
We use `vite@2.9.6` because `vite>=2.9.7` is interfering with `nodemon`, see https://github.com/vitejs/vite/issues/8492

This means that when using `npm run start-dev` -> [`start-dev.js`](40f9d2ea5a/server/start-dev.js), `nodemon` wasn't restarting when a change was made.
2022-06-07 14:56:46 -05:00
Eric Eastwood 40f9d2ea5a
Update to work with latest `hydrogen-view-sdk@0.0.12` (#10)
Get this project running again after a few months of changes
from `hydrogen-view-sdk` and now finally after
https://github.com/vector-im/hydrogen-web/pull/693
merged to make the SDK friendly to locally link and develop on.
2022-06-06 18:58:45 -05:00
Eric Eastwood c34ed9ea0e Fix last event in day not being shown 2022-02-24 13:06:19 -06:00
Eric Eastwood 2153fa4852 Use file that will resolve in new `./assets/*` export
Because we had to update the `assets` `export` to avoid the following deprecation warning:

```
(node:133896) [DEP0148] DeprecationWarning: Use of deprecated folder mapping "./assets/" in the "exports" field module resolution of the package at C:\Users\MLM\Documents\GitHub\element\matrix-public-archive\node_modules\hydrogen-view-sdk\package.json.
Update this package.json to use a subpath pattern like "./assets/*".
(Use `node --trace-deprecation ...` to show where the warning was created)
```

`hydrogen-view-sdk` `package.json` before:
```json
{
  "exports": {
      ".": {
          "import": "./lib-build/hydrogen.es.js",
          "require": "./lib-build/hydrogen.cjs.js"
      },
      "./paths/vite": "./paths/vite.js",
      "./style.css": "./style.css",
      "./assets/": "./asset-build/assets/"
  }
}
```

`hydrogen-view-sdk` `package.json` after:

```json
{
  "exports": {
      ".": {
          "import": "./lib-build/hydrogen.es.js",
          "require": "./lib-build/hydrogen.cjs.js"
      },
      "./paths/vite": "./paths/vite.js",
      "./style.css": "./style.css",
      "./assets/*": "./asset-build/assets/*"
  }
}
```
2022-02-24 13:02:55 -06:00
Eric Eastwood db6d3797d7 Working e2e test 2022-02-24 03:27:53 -06:00
Eric Eastwood 839e31a35e E2E test but still failing because fetching from start of day before test events happened 2022-02-23 21:25:05 -06:00
Eric Eastwood 40fa71df03 Clean up federation test 2022-02-22 20:55:42 -06:00
Eric Eastwood 0f493a241e Federated homeservers in Docker for e2e tests 2022-02-22 20:25:24 -06:00
Eric Eastwood aea382e4f8 WIP: start of tests 2022-02-22 16:06:29 -06:00
Eric Eastwood 7d95681611 Make sure input value is zero padded to parse properly 2022-02-18 12:34:21 -06:00
Eric Eastwood 652b96950b Some grid offset simplications thanks to setting correct 1-based date 2022-02-17 21:50:31 -06:00
Eric Eastwood 4f506f0b7e Fix grid-offset when month starts on Sunday like February 2015 2022-02-17 21:29:27 -06:00
Eric Eastwood b41ee62e59 Add year select fallback for Firefox 2022-02-17 21:10:36 -06:00
Eric Eastwood 6d5ad656b6 Add month/year selector and use UTC date functions 2022-02-17 20:45:47 -06:00
Eric Eastwood fe3f515862 Calendar styles and server hydrogen assets 2022-02-17 16:56:54 -06:00
Eric Eastwood b401cbbc3a Add active date to calendar 2022-02-16 23:08:18 -06:00
Eric Eastwood 6a5d011a45 Link calendar days 2022-02-16 19:58:32 -06:00
Eric Eastwood 8482c1b667 Clarify that Node.js v16 will also work 2022-02-16 17:11:09 -06:00
Eric Eastwood 8160b6d928 Fix reply permalinks having undefined room ID's 2022-02-16 14:56:20 -06:00