Commit Graph

231 Commits

Author SHA1 Message Date
Cyberes ebcce1ac2e revert 2023-07-06 16:03:38 -06:00
Cyberes 09492fc1e3 make tagline a link 2023-07-06 16:01:15 -06:00
Cyberes 341faf1aab Modify tagline 2023-07-06 15:56:00 -06:00
Cyberes eaecd6167b missed one line 2023-07-06 15:08:51 -06:00
Cyberes 7879837f25 Merge https://github.com/matrix-org/matrix-public-archive 2023-07-06 15:05:01 -06:00
Eric Eastwood 1d1d7d2d0d Prepare changelog with #276 2023-06-30 03:09:26 -05:00
Eric Eastwood dd2cd9126d
Only show `world_readable` rooms in the room directory (#276)
Happens to address part of https://github.com/matrix-org/matrix-public-archive/issues/271
but made primarily as a follow-up to https://github.com/matrix-org/matrix-public-archive/pull/239

---

Only 42% of rooms in the `matrix.org` room directory are `world_readable`, which means we would get pages of rooms that are half-empty most of the time if we just naively fetched 9 rooms at a time.

Ideally, we would be able to just add a filter directly to `/publicRooms` in order to only grab the `world_readable` rooms and still get full pages but the filter option doesn't allow us to slice by `world_readable` history visibility.

Instead, we have to paginate until we get a full grid of 9 rooms, then make a final `/publicRooms` request to backtrack to the exact continuation point so the next page won't skip any rooms in between.
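
As a rough illustration of the pagination described above (a sketch, not the project's actual code): `fetchPublicRooms` is a hypothetical stand-in for a `/publicRooms` request that resolves to `{ chunk, next_batch }`, and `fetchWorldReadablePage` is a name made up for this example.

```javascript
// Sketch only: `fetchPublicRooms({ since, limit })` is a hypothetical helper
// that wraps `/publicRooms` and resolves to `{ chunk, next_batch }`.
async function fetchWorldReadablePage(fetchPublicRooms, since, pageSize = 9) {
  const rooms = [];
  let token = since;

  while (true) {
    const requestToken = token;
    const { chunk, next_batch } = await fetchPublicRooms({
      since: requestToken,
      limit: pageSize,
    });

    for (let i = 0; i < chunk.length; i++) {
      // Skip anything that has not opted into `world_readable` history
      if (!chunk[i].world_readable) continue;
      rooms.push(chunk[i]);

      if (rooms.length === pageSize) {
        const consumed = i + 1;
        // We used the whole response, so its token is already exact
        if (consumed === chunk.length) {
          return { rooms, nextToken: next_batch };
        }
        // Backtrack: re-request only the rooms we consumed from this response
        // so the continuation token points right after the last room shown.
        const exact = await fetchPublicRooms({
          since: requestToken,
          limit: consumed,
        });
        return { rooms, nextToken: exact.next_batch };
      }
    }

    // The directory ran out before we could fill the grid
    if (!next_batch) return { rooms, nextToken: undefined };
    token = next_batch;
  }
}
```

The extra request costs one round trip per page, but it avoids having to carry an offset alongside the pagination token.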

---

We had empty spaces in the grid before because some rooms in the room directory are private, which we already filtered out. But that was a much rarer experience since only 2% of rooms were private.
2023-06-30 03:08:32 -05:00
Eric Eastwood a26b852c5a Prepare changelog with #279 2023-06-29 18:59:54 -05:00
Eric Eastwood 59c9d3180e
Fix `18+` false positives with NSFW check (#279)
Was noticing false positives with our test room names like: `planet-1688081266353-room-18`

Before:
```regex
/(\b|_)18+(\b|_)/i
```

After:
```regex
/(\b|_|-|\s|^)18\+(\b|_|-|\s|$)/i
```
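
To see the difference between the two patterns above, they can be run against a few names. The test room name comes from this commit message; the other two names are made up for illustration.

```javascript
// The two patterns quoted above
const before = /(\b|_)18+(\b|_)/i;
const after = /(\b|_|-|\s|^)18\+(\b|_|-|\s|$)/i;

const roomNames = [
  'planet-1688081266353-room-18', // test room name from the commit message
  'Adults 18+ only', // made-up example
  'nsfw_18+_chat', // made-up example
];

for (const name of roomNames) {
  // In `before`, `8+` means "one or more 8s", so a bare trailing `18` matches.
  // In `after`, `\+` requires a literal plus sign after `18`.
  console.log(name, { before: before.test(name), after: after.test(name) });
}
```

Only the first name changes result: `before` flags it and `after` does not.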
2023-06-29 18:58:53 -05:00
Eric Eastwood 5de8cb4e35 Prepare changelog with #278 2023-06-28 20:30:31 -05:00
Eric Eastwood 0fc4421432
Indicate when the room was set to `world_readable` and by who (#278) 2023-06-28 20:29:49 -05:00
Eric Eastwood a79342f83c Prepare changelog with #277 2023-06-28 18:15:18 -05:00
Eric Eastwood 3b378675c3
Update FAQ to explain `world_readable` only (#277)
Follow-up to https://github.com/matrix-org/matrix-public-archive/pull/239
2023-06-28 18:14:31 -05:00
Eric Eastwood ff18a46283 Attribute #239 author 2023-06-28 00:24:20 -05:00
Eric Eastwood 3df0f00809 Prepare changelog with #239 and #275 2023-06-27 17:18:25 -05:00
Eric Eastwood 2243db5fff
Fix eslint trying to look at `node_modules/` (#275)
I'm not sure what exactly is causing the behavior change now besides that I am running on Linux.

The quote fix around the path is from https://stackoverflow.com/questions/51021751/express-js-lint-gives-mistake

`node_modules/` is already part of our `.eslintignore`.

Previously:

```sh
$ npm run lint

> matrix-public-archive@0.1.0 lint
> eslint **/*.js

Oops! Something went wrong! :(

ESLint: 8.37.0

You are linting "node_modules/ipaddr.js", but all of the files matching the glob pattern "node_modules/ipaddr.js" are ignored.

If you don't want to lint these files, remove the pattern "node_modules/ipaddr.js" from the list of arguments passed to ESLint.

If you do want to lint these files, try the following solutions:

* Check your .eslintignore file, or the eslintIgnore property in package.json, to ensure that the files are not configured to be ignored.
* Explicitly list the files from this glob that you'd like to lint on the command-line, rather than providing a glob as an argument.
```
2023-06-27 17:16:16 -05:00
Tulir Asokan 1d3e930fbd
Don't allow previewing `shared` history rooms (#239)
Only `world_readable` can be considered as opting into having history publicly on the web. Anything else must not be archived until there's a dedicated state event for opting into archiving.
2023-06-27 16:56:58 -05:00
Cyberes c493ca18c9 re-brand 2023-06-26 19:25:25 -06:00
Cyberes 9dd1933638 Remove join reason. Remove opt out in FAQ 2023-06-26 19:10:47 -06:00
Eric Eastwood e4800852ff Prepare changelog with #269 2023-06-22 02:24:25 -05:00
Eric Eastwood dd27c1054a
Prefer canonical alias in `rel=canonical` link (#269)
Follow-up to https://github.com/matrix-org/matrix-public-archive/pull/266

Part of https://github.com/matrix-org/matrix-public-archive/issues/251
2023-06-22 02:23:48 -05:00
Eric Eastwood 79df21b0f6 Prepare changelog with #268 2023-06-22 01:55:57 -05:00
Eric Eastwood aff0423f4c
Prevent join event spam with stable `reason` (#268)
Fix https://github.com/matrix-org/matrix-public-archive/issues/267

In the case of someone visiting a room via an alias, we can't get access to the `room_id` before we join the room. I've opted to just point to the Matrix Public Archive instance in general. This way the `join` reason is always stable regardless of how someone is visiting the room.

Join `reason` was originally added in https://github.com/matrix-org/matrix-public-archive/pull/262
2023-06-22 01:55:21 -05:00
Eric Eastwood c5debf4f7a Prepare changelog with #266 2023-06-22 01:51:44 -05:00
Eric Eastwood 0f522bed20
Use `rel=canonical` link to de-duplicate event permalinks (#266)
Fix https://github.com/matrix-org/matrix-public-archive/issues/251
2023-06-22 01:50:55 -05:00
Eric Eastwood 3414fcf7b2 Prepare changelog with #265 2023-06-21 20:30:38 -05:00
Eric Eastwood cf51d04433
Add `/faq` redirect (#265)
Part of https://github.com/matrix-org/matrix-public-archive/issues/257
so we can set the display name of the bot to `archive.matrix.org/faq` and
people can read what the project is about and why the bot joined.
2023-06-21 20:29:26 -05:00
Eric Eastwood fbd23d2d91 Prepare changelog with #262 2023-06-09 16:07:26 -05:00
Eric Eastwood 1dd63212c0
Add reason why the archive bot is joining the room (#262)
Using the join `reason` added in [MSC2367](https://github.com/matrix-org/matrix-spec-proposals/pull/2367). Unfortunately, this PR doesn't have much effect because it doesn't look like many clients support it yet (Element doesn't support it for example).

Part of https://github.com/matrix-org/matrix-public-archive/issues/257
2023-06-09 16:05:20 -05:00
Eric Eastwood 8da9b3d957 Prepare changelog with #263 2023-06-06 11:34:46 -05:00
Eric Eastwood 9d55b4a505
Remove libera.chat as a default since their rooms are not accessible in the archive (#263)
The history visibility in Libera rooms is set to `join` which means it's not
accessible in the archive at all. Instead of leading a bunch of people to
`403 Forbidden`, we can just remove it from the default list.

The default list was mostly just copied from the Element list of defaults.
2023-06-06 11:33:56 -05:00
Eric Eastwood 9d3b1766ad Prepare changelog with #261 2023-06-02 17:21:06 -05:00
Eric Eastwood c26bdc5ffb
Fix Firefox sorting room cards in the wrong direction (#261)
Room cards will now sort by room members descending (highest to lowest) as expected. 

Fix https://github.com/matrix-org/matrix-public-archive/issues/218

The `/publicRooms` (room directory) endpoint already returns rooms in the correct order, which is why we didn't care about the order before, but the different `[].sort(...)` implementations in browsers necessitate that we be explicit about it. Ideally, we wouldn't have to use the `ObservableMap.sortValues()` method at all but it seems like one of the only ways to get the values out. In any case, it's probably clearer now what order things are in.

This bug stems from the fact that `[1, 2, 3, 4, 5].sort((a, b) => 1)` returns different results in Chrome vs Firefox (found from https://stackoverflow.com/questions/55039157/array-sort-behaves-differently-in-firefox-and-chrome-edge)

 - Chrome: `[1, 2, 3, 4, 5].sort((a, b) => 1)` -> `[1, 2, 3, 4, 5]`  
 - Firefox: `[1, 2, 3, 4, 5].sort((a, b) => 1)` -> `[5, 4, 3, 2, 1]` 
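
A minimal sketch of an explicit comparator that does not depend on the engine's handling of a constant return value. It assumes each room carries the `num_joined_members` field from the `/publicRooms` response; the function name is made up, and this is not the `ObservableMap.sortValues()` call itself.

```javascript
// Sort by member count, highest to lowest, with a comparator that actually
// compares its two arguments instead of returning a constant.
function sortRoomsByMembersDescending(rooms) {
  // Copy first so the caller's array is left untouched
  return [...rooms].sort(
    (a, b) => b.num_joined_members - a.num_joined_members
  );
}

const sorted = sortRoomsByMembersDescending([
  { name: 'small', num_joined_members: 3 },
  { name: 'large', num_joined_members: 500 },
  { name: 'medium', num_joined_members: 42 },
]);
console.log(sorted.map((room) => room.name));
```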
2023-06-02 17:19:58 -05:00
Eric Eastwood dfeae90829
Link FAQ about indexing in the right-panel footer (#258)
Link FAQ about indexing in the right-panel footer so people can more easily
understand what goes into the result and find issues to track about opting out.

 - https://github.com/matrix-org/matrix-public-archive/issues/47
 - 5caf9dc1b8/docs/faq.md (how-do-i-opt-out-and-keep-my-room-from-being-indexed-by-search-engines)
2023-05-31 01:23:39 -05:00
Eric Eastwood 5caf9dc1b8
Add context and demystify public/world_readable/guest/peeking in the FAQ (#241)
Add context and demystify public/world_readable/guest/peeking in the FAQ

Spawning from:

 - https://github.com/matrix-org/matrix-public-archive/issues/47#issuecomment-1568698809
 - https://matrix.to/#/!SzoPnANsRYxITaDPaJ:matrix.org/$Zwr_GzklOjhRoAY-D9Ekh0qYhZYUWI_d5HUcJ1180zM?via=matrix.org&via=evulid.cc&via=t2l.io
 - https://matrix.to/#/!QQpfJfZvqxbCfeDgCj:matrix.org/$ZKgZ6oPhW39gByfORAxp-zfL_g6lISL73Ms_6D16SPQ?via=matrix.org&via=element.io&via=envs.net
 - https://matrix.to/#/!SzoPnANsRYxITaDPaJ:matrix.org/$eC8O8zFwsvEkoy2kdwrNowKhvCg_kCU7zZhjSlSEGto?via=matrix.org&via=evulid.cc&via=t2l.io
2023-05-30 13:57:27 -05:00
Eric Eastwood 4797f1e46a
Document why changes to locally linked hydrogen-view-sdk don't trigger a rebuild (#240) 2023-05-30 10:34:35 -05:00
Eric Eastwood ff4c948518
Update Vite while debugging some unrelated issues (#232)
Still good to have the latest fixes in any case
2023-05-26 17:53:59 -05:00
Eric Eastwood 68824ce4db Prepare changelog with #231 2023-05-23 12:30:19 -05:00
Eric Eastwood a9aa08f24a
Catch NSFW rooms with underscores (#231)
`\b` treats `_` as a word character, so there is no word boundary between `NSFW` and `_`. This is a bit unexpected since we expect a room that looks like `NSFW_foo` to be caught in safe search.
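
The behavior is easy to reproduce in isolation. The two patterns below are simplified illustrations, not the project's actual NSFW check:

```javascript
// Simplified, illustrative patterns (the real check is not shown here)
const wordBoundaryOnly = /\bnsfw\b/i;
const withUnderscore = /(\b|_)nsfw(\b|_)/i;

for (const name of ['NSFW_foo', 'foo_nsfw', 'NSFW room']) {
  // `_` counts as a word character, so `\b` does not match between `W` and `_`
  console.log(name, {
    wordBoundaryOnly: wordBoundaryOnly.test(name),
    withUnderscore: withUnderscore.test(name),
  });
}
```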
2023-05-23 12:29:19 -05:00
Eric Eastwood f05d36e9f4
Fix mistake in config access for workaroundCloudflare504TimeoutErrors (#229)
Follow-up to https://github.com/matrix-org/matrix-public-archive/pull/228
2023-05-11 16:34:16 -05:00
Eric Eastwood e96f36a1a6 Prepare changelog with #228 2023-05-11 16:26:02 -05:00
Eric Eastwood 55f1867c68
Prevent Cloudflare from overriding our own 504 timeout page (#228)
Explored in https://gitlab.matrix.org/matrix-public-archive/deployment/-/issues/2 (internal deployment issue)

> Cloudflare returns an Cloudflare-branded HTTP 502 or 504 error when your origin web server responds with a standard HTTP 502 bad gateway or 504 gateway timeout error:
>
> *-- https://developers.cloudflare.com/support/troubleshooting/cloudflare-errors/troubleshooting-cloudflare-5xx-errors/#502504-from-your-origin-web-server*

<img src="https://github.com/matrix-org/matrix-public-archive/assets/558581/46f6d88c-ba53-4efb-809f-3f331bf9b799" width="400">


The only way to disable this functionality is to have an Enterprise Cloudflare plan and use the `Enable Origin Error Pages` option:

> **Enable Origin Error Pages**
>
> When Origin Error Page is set to “On”, Cloudflare will proxy the 502 and 504 error pages directly from the origin.
>
> Requires Enterprise or higher

So instead of dealing with that headache, we're just working around this by responding with a 500 error when we time out. Should be good enough I think. The user won't know any difference but it may affect what search engines think. Not sure search engines care about the distinction since the page is slow to respond anyway, which they punish.
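
The workaround boils down to choosing a different status code for the timeout page. A minimal sketch, assuming a plain config object with a `workaroundCloudflare504TimeoutErrors` flag (the flag name appears in the follow-up commit #229; the function name and config shape are hypothetical):

```javascript
// Hypothetical helper: picks the status code to send with our own timeout page
function getTimeoutStatusCode(config) {
  // Cloudflare replaces 502/504 responses from the origin with its own page,
  // so fall back to a generic 500 when the workaround is enabled.
  return config.workaroundCloudflare504TimeoutErrors ? 500 : 504;
}

console.log(getTimeoutStatusCode({ workaroundCloudflare504TimeoutErrors: true }));
console.log(getTimeoutStatusCode({ workaroundCloudflare504TimeoutErrors: false }));
```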
2023-05-11 16:24:58 -05:00
Eric Eastwood bf3ca52c3b 0.1.0 2023-05-11 15:38:21 -05:00
Eric Eastwood 0bd6454afa
Prepare changelog with initial release (#227) 2023-05-11 15:38:11 -05:00
Eric Eastwood f3fb3e02ec
Update Hydrogen SDK to include MXC URLs on media (#226)
Useful in moderation scenarios where you want to quarantine media and can quickly/easily look at the data attribute on the media (image/videos). Or simply write a little script to extract all of the `data-mxc-url` and `data-thumbnail-mxc-url` attributes.

ex.
```
<img src="http://localhost:8008/_matrix/media/r0/thumbnail/my.synapse.server/TEyTVUNgvUQZcXlrLwgGfbcp?width=400&amp;height=266&amp;method=scale" alt="Stormclouds.jpg" title="Stormclouds.jpg" data-mxc-url="mxc://my.synapse.server/kxibKhxRfTvFWyuWwWvFuBtE" data-thumbnail-mxc-url="mxc://my.synapse.server/TEyTVUNgvUQZcXlrLwgGfbcp" style="max-width: 400px; max-height: 266px;">
```
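
The "little script" mentioned above could look something like this. It is a sketch that scans an HTML string with a regular expression rather than a real HTML parser, and the function name is made up:

```javascript
// Collect every `data-mxc-url` and `data-thumbnail-mxc-url` value from a
// string of HTML. A regex is enough for a quick moderation script but is not
// a substitute for proper HTML parsing.
function extractMxcUrls(html) {
  const pattern = /\bdata-(?:thumbnail-)?mxc-url="([^"]*)"/g;
  return [...html.matchAll(pattern)].map((match) => match[1]);
}

const exampleHtml =
  '<img src="http://localhost:8008/_matrix/media/r0/thumbnail/my.synapse.server/TEyTVUNgvUQZcXlrLwgGfbcp?width=400&amp;height=266&amp;method=scale" ' +
  'data-mxc-url="mxc://my.synapse.server/kxibKhxRfTvFWyuWwWvFuBtE" ' +
  'data-thumbnail-mxc-url="mxc://my.synapse.server/TEyTVUNgvUQZcXlrLwgGfbcp">';

console.log(extractMxcUrls(exampleHtml));
```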
2023-05-11 02:08:56 -05:00
Eric Eastwood 1a140b39c6
Better grammar in URL preview description (#225)
Part of https://github.com/matrix-org/matrix-public-archive/issues/202
2023-05-10 01:12:49 -05:00
Eric Eastwood 16323df054
Add image metadata for URL previews (#224)
 - Default to a nice `[matrix]` banner
    - There is room for improvement here when the Matrix Public Archive gets its own logo (https://github.com/matrix-org/matrix-public-archive/issues/94) and maybe says "Matrix Public Archive" somewhere in the banner.
    - This is good enough for now (and certainly better than downstream previews using the first image on the page).
 - For rooms, it will use the room avatar

Part of https://github.com/matrix-org/matrix-public-archive/issues/202

Image is sized to 1200x630 to match conventions of `og:image`.

Crafted the banner image by modifying the header on the room directory homepage and taking a node screenshot. Page zoom @ 175%
2023-05-10 00:50:12 -05:00
Eric Eastwood bf8040f48e
Fix checkbox being checked by default when the value was actually `null` (#221)
Fix `debugActiveDateIntersectionObserver` checkbox being checked by default when the value was actually `null`.
Now we properly only care about explicit `'true'`, `'false'` from local storage.

Before:
```
<input id="debugActiveDateIntersectionObserver" type="checkbox" checked="null">
```

After:
```
<input id="debugActiveDateIntersectionObserver" type="checkbox">
```
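
A minimal sketch of the "only explicit `'true'`/`'false'`" handling, assuming the raw value comes from `localStorage.getItem(...)`, which returns `null` when the key is missing. The function name and default are hypothetical:

```javascript
// Only an explicit 'true' or 'false' string counts; anything else
// (including `null` for a missing key) falls back to the default.
function parseStoredBoolean(rawValue, defaultValue = false) {
  if (rawValue === 'true') return true;
  if (rawValue === 'false') return false;
  return defaultValue;
}

console.log(parseStoredBoolean(null));
console.log(parseStoredBoolean('true'));
console.log(parseStoredBoolean('false'));
```

With a real boolean in hand, the `checked` attribute can simply be left off when the value is `false`, as in the "After" markup.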
2023-05-05 19:44:49 -05:00
Eric Eastwood ed3fde7845
Various updates to put `archive.matrix.org` in the forefront (#220)
Fix https://github.com/matrix-org/matrix-public-archive/issues/212

Screenshot at 90% zoom with even dimensions for better scaling
2023-05-05 17:42:28 -05:00
Eric Eastwood 198e8c09be
Mark NSFW room pages with `<meta name="rating" content="adult">` (#216)
Related docs:

 - https://developers.google.com/search/docs/crawling-indexing/safesearch
 - https://developers.google.com/search/docs/crawling-indexing/special-tags
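
For illustration, conditionally emitting the tag might look like this. The function name and the `isNsfw` flag are hypothetical; how the project decides a room is NSFW is not shown here.

```javascript
// Return the SafeSearch rating tag for NSFW pages and nothing otherwise
function renderRatingMetaTag(isNsfw) {
  return isNsfw ? '<meta name="rating" content="adult">' : '';
}

console.log(renderRatingMetaTag(true));
```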
2023-05-05 15:36:26 -05:00