@DanRoscigno DanRoscigno commented Apr 17, 2025

I am submitting this PR to help people (including me!) manage page boosting with DocSearch and the Algolia Crawler. Because I do not write JavaScript often, using Cheerio is not easy for me, so I took good notes while I worked through this and would like to share them with others. I am very happy with the way this is working.

Signed-off-by: DanRoscigno <dan@roscigno.com>
netlify bot commented Apr 17, 2025

Deploy Preview for docsearch ready!

Name Link
🔨 Latest commit 71402d0
🔍 Latest deploy log https://app.netlify.com/projects/docsearch/deploys/6806b2a95d31ac0008f7370b
😎 Deploy Preview https://deploy-preview-2575--docsearch.netlify.app

@DanRoscigno
Author

@kai687 thanks for your help working through this last year. I finally got around to submitting a PR to document this.

@DanRoscigno
Author

Reviewers: I don't generally write JavaScript, so please let me know if I am not following best practices.

Add extraction examples
remove blank lines
@randombeeper
Contributor

Hi @DanRoscigno, nice contribution. I totally agree we need a bigger section on boosting. Too many people are unaware of it, so it absolutely needs more visibility and explanation.

I was able to scan over the PR this afternoon but not dig too deeply into it. Two things jump out at me. The first is that the code highlighting looks fine for me in dark mode but not so much in light mode. I'm not sure if I have something wrong on my end.

The other thing is that I think this can be simplified by taking this approach: http://localhost:3000/docs/required-configuration/#introduce-global-information-as-meta-tags. Using slightly different meta tags should remove the need to modify the crawler. One downside to this approach is that, IMO, it is invisible, and I could see it easily confusing someone: "why do these results seem boosted when the crawler isn't doing anything special here?" I think it could be easy to miss. But we can potentially make that a separate section, as another way to accomplish the same result.

Will revisit again early next week.

@DanRoscigno
Author

Thanks James, I will have a look at the highlighting. This is probably a change to your CSS.

Regarding the meta tags, if I am reading the doc correctly I could use:

<meta name="docsearch:pageRank" content="100" />

That would be wonderful. I see that the reactive docs use this. I will give it a go and update the PR with this method. I will probably put two scenarios in there, since I do have a non-standard field that I extract.

@randombeeper
Contributor

Yes, exactly, https://dataclient.io/docs and https://resthooks.io/docs are some examples of this.

I think it would be great to cover both approaches, since some users will want to do one or the other for different reasons. But document whichever one(s) you want to do and I will fill in the rest after the PR is merged.

@DanRoscigno
Author

Yup, I just finished testing docsearch:pagerank in the StarRocks docs and reverted the changes I made to my crawler. I will include both in the PR, as I am extracting other data in our crawler.
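
For anyone following along, here is a rough sketch of what the crawler-side variant could look like when the boost is read with Cheerio and passed to `helpers.docsearch` as `pageRank`. This is not the code from the PR. The helper name `readPageRank`, the fallback of `"0"`, the placeholder index name and URL, and the selectors are all assumptions for illustration, and reading the value from the meta tag is just one possible source for it.

```javascript
// Hypothetical helper (not from the PR): read a per-page boost from a meta tag.
// `$` is the Cheerio-style selector function that the crawler passes to
// recordExtractor. The meta tag name and the "0" fallback are assumptions.
function readPageRank($, metaName = "docsearch:pageRank") {
  const raw = $(`meta[name="${metaName}"]`).attr("content");
  const value = Number.parseInt(raw ?? "", 10);
  // Return a string, falling back to "0" when the tag is missing or not numeric.
  return Number.isNaN(value) ? "0" : String(value);
}

// Sketch of a single crawler action. The index name, URL pattern, and
// selectors are placeholders and need to match your own site.
const action = {
  indexName: "YOUR_INDEX_NAME",
  pathsToMatch: ["https://YOUR_WEBSITE_URL/**"],
  recordExtractor: ({ $, helpers }) =>
    helpers.docsearch({
      recordProps: {
        lvl0: { selectors: "", defaultValue: "Documentation" },
        lvl1: "article h1",
        content: "article p, article li",
        // Boost value for every record extracted from this page.
        pageRank: readPageRank($),
      },
    }),
};
```

Keeping the lookup in a small function makes it easy to try against a stubbed `$` before pasting the action into the crawler editor.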

@DanRoscigno
Author

@randombeeper can you approve the Netlify build so I can see the CSS issue? I cannot get the docs to build on my end.

Signed-off-by: DanRoscigno <dan@roscigno.com>
@DanRoscigno DanRoscigno marked this pull request as draft April 21, 2025 21:47
@randombeeper
Contributor

Hi @DanRoscigno, I think we got your docs working locally a while back. Do you still want to try to finish this out?

@DanRoscigno
Author

> Hi @DanRoscigno, I think we got your docs working locally a while back. Do you still want to try to finish this out?

Yes, thanks James. On my cal for tomorrow!
